yeatmanlab / pyAFQ

Automated Fiber Quantification ... in Python
http://yeatmanlab.github.io/pyAFQ/
BSD 2-Clause "Simplified" License
53 stars 34 forks source link

Revert #1086? #1106

Open arokem opened 5 months ago

arokem commented 5 months ago

Some bleeding-edge users have reported that they dislike the changes introduced in #1086, that now use full tract names in the file-names of the results. One option would be to revert that PR and go back to cryptic acronyms, but provide a table in the documentation that lays out what these acronyms mean (and that we can copy to every paper that we write using the software). OTOH, arguably the new filenames are not that much longer than a BIDSish filename is anyway. For example:

sub-100206_coordsys-RASMM_trkmethod-probCSD_recogmethod-AFQ_desc-LeftCorticospinal_tractography.trk

vs.

sub-100206_coordsys-RASMM_trkmethod-probCSD_recogmethod-AFQ_desc-LCST_tractography.trk

So, I am not strongly pro or con here, and would love to hear users/developers chiming in.

36000 commented 5 months ago

For now there is a dictionary to convert between the two here: #1107

For me, I think we should keep the longer names, for these reasons: I think that the filenames are not that much longer there are diffusion MRI researchers who are familiar with major tracts but are not necessarily familiar with the AFQ acronyms, and we want people like that to be able to read our papers more easily the smaller acronyms are harder for new people to memorize than the longer names, which have some meaning in them (from what I have seen) it makes code and documentation more readable for new people

The counter arguments I have seen are: People who use AFQ are already familiar with these acronyms It's harder to compare results across versions, breaking consistency and requiring additional time/code They are shorter and easier to type, taking less space in code and filenames

36000 commented 5 months ago

One thing you could also do is have abbreviated tract names that are not just acronyms, like: L InfFroOccipital L SupLong L PostArcuate etc.

arokem commented 5 months ago

That reads like a 🦓 mixed with a 🦒 to me! 😄

mayayab commented 5 months ago

The longer names don't only affect the filenames but also the tract names in the the csv files, so lead to longer variable names when filtering and wrangling dataframes. I think that for plotting it really depends on the plot, I can see that for a figure that has a single tract you could want the full name but for figures that include multiple tracts (very common use case?) it actually makes more sense to use the acronyms because the full names won't fit. I see that people can have different preferences for their figures, so someone has to be unhappy and end up renaming their variables :) Personally I'm in favor of prioritizing consistency - this is a software that some labs have been using for a while now and it's confusing and breaking downstream code if different versions have different filenames. It makes comparing versions more complicated than it ought to be (for example before and after adding the exclusion IFOF ROI, before and after fixing the Mahalanobis distance, all things we did in the last month). To me this is the main argument. I think that providing a simple table that maps each acronym to the full name can assist new users.

arokem commented 4 months ago

I am starting to come around to @mayayab's position. I think that the cost in complicated code with long string names is higher than the cost for new users of looking up the meaning of these rather well-established acronyms. So, I'm starting to tend in the direction of reverting that change and then adding a table to the documentation that provides the mapping in #1107

36000 commented 4 months ago

What if we saved both the acronyms and the full names in the resulting tract profiles CSV file?

arokem commented 4 months ago

I think that the crux of the current issue is what the filenames would be. If we have a dictionary/table that converts acronyms to full names somewhere in the software, we could use that anywhere we'd like. But filenames are going to be long and complicated if they are long and complicated.