SlicerDMRI / whitematteranalysis

White matter tractography clustering and more...
https://dmri.slicer.org/whitematteranalysis/
Other
65 stars 32 forks source link

STYLE: Fix filename and dirname creation floating point formatting #232

Closed jhlegarreta closed 1 week ago

jhlegarreta commented 1 week ago

Fix formatting of numbers that can be floats when creating filenames and dirnames: when transitioning to f-strings (commit ac6040f), it was assumed that sigma and nonrigid_grid_resolution would be integers, based on the existing formatting. However, these attributes can be set to non-integer values. The pre-f-string transition solution, e.g.:

"iteration_%05d_sigma_%05d" % (self.total_iterations, self.sigma)

trimmed the decimal part, and thus, for e.g. total_iterations 20 and sigma 7.5, it would produce:

'iteration_00020_sigma_00007'

With the introduction of f-strings, i.e.

f"iteration_{self.total_iterations:05d}_sigma_{self.sigma:05d}"

this would produce an error, since the sigma floating point value cannot be formatted as an integer:

{ValueError}ValueError("Unknown format code 'd' for object of type
'float'")

This patch set fixes the error by adopting the previous behavior: it trims the decimal part.

AlbertoImg commented 1 week ago

Thanks @jhlegarreta for looking at my question! I am not sure whether int(sigma) is the best option, the .5 in 7.5 is information needed for the report. In my local version, I changed d to f, and now the folder / files names have a "." in all the cases where sigma appears but still with the decimal info.

jhlegarreta commented 1 week ago

I am not sure whether int(sigma) is the best option, the .5 in 7.5 is information needed for the report. In my local version, I changed d to f, and now the folder / files names have a "." in all the cases where sigma appears but still with the decimal info.

I know it is not the best solution, but it is the behavior that it had before... You will agree with me that having a period in the filename/dirname is never a good solution. Happy to consider other solutions if you come up with any.

jhlegarreta commented 1 week ago

@AlbertoImg I am merging this: the current workaround is better than a failure. As said, better strategies can be considered as new PRs.

AlbertoImg commented 1 week ago

A workaround could be to use "f" instead of "d" in the formatting, and later on replacing the dot in the dirname string: For example: dirname = f"iteration_{self.total_iterations:05d}sigma{self.sigma:03d}grid{self.nonrigid_gridresolution:03d}" dirname = dirname.replace(".","")

jhlegarreta commented 1 week ago

@AlbertoImg Please, check the statement: the strings still have the integer specifier. I had already thought about the solution you proposed, but discarded it because the result would not be consistent with what you get when using only integers, e.g.:

total_iterations = 10
sigma = 0.75
nonrigid_grid_resolution = 0.02

dirname = f"iteration_{total_iterations:05d}sigma{sigma:03f}grid{nonrigid_grid_resolution:03f}"
dirname = dirname.replace(".","_")

You get iteration_00010sigma0_750000grid0_020000. As you see, the floating point numbers are appended with zeros, whereas the integer is prepended with them: i.e. would that be preferred over iteration_00010sigma_00075grid_00020? It's perfectly fine from the point of view of the programming and certainly does not lead to confusion as opposed to iteration_00010sigma_00075grid_00020 (i.e. is sigma 75 vs 0.75 or the resolution 20 instead of 0.2?), but I am not sure about the downstream consequences, if any.

AlbertoImg commented 1 week ago

Here "iteration_00010sigma_00075grid00020", maybe confusing when comparing the format of the iterations and the others, but it is just my opinion. Another idea: dirname = f"iteration {totaliterations:04d} sigma {sigma:.2f} grid _ {nonrigid_gridresolution:.2f}" # (spaces between "" added just for visualization in this conversation) dirname = dirname.replace(".","p")

'iteration_0010_sigma_0p75_grid_0p02'

AlbertoImg commented 2 days ago

FYI: congeal_multisubject.py also has same issues. Thanks! Line 149 e.g.

jhlegarreta commented 1 day ago

FYI: congeal_multisubject.py also has same issues. Thanks! Line 149 e.g.

Can you be more specific? The fix for the issue reported in https://github.com/SlicerDMRI/whitematteranalysis/commit/ac6040fcf77ac132d0fc4a071af9d80240f7f577#r143635819 for whitematteranalysis/congeal_to_atlas.py was made extensive to whitematteranalysis/congeal_multisubject.py in this PR, and line 149 on that file does not point to a statement related to this PR: https://github.com/SlicerDMRI/whitematteranalysis/blob/d7c6bd8e3eea33e4efb023daf7aef1bd9e544074/whitematteranalysis/congeal_multisubject.py#L149

AlbertoImg commented 1 day ago

The line number was wrong. I am sorry. The line were 479 and 491 479: register.process_id_string = f"subject{subject_idx:05d}iteration{iteration_count:05d}sigma{sigma:03d}" 491: register.process_id_string = f"subject{subject_idx:05d}iteration{iteration_count:05d}sigma{sigma:03d}grid{grid_resolution:03d}"

jhlegarreta commented 1 day ago

OK. Watch the underscores and markdown highlighting. PR #236.