AFM-SPM / TopoStats

An AFM image analysis program to batch process data and obtain statistics from images
https://afm-spm.github.io/TopoStats/
GNU Lesser General Public License v3.0

Remove saving of gaussian filtered arrays to .npy files #804

Closed. ns-rse closed this 7 months ago

ns-rse commented 7 months ago

Closes #802

Arrays and data are now saved in HDF5 files (#790) and so .npy arrays are somewhat redundant. This PR leaves the io.save_array() function in place should interactive use require saving of arrays but removes its use from topostats.processing.run_filters() so that the .npy files are no longer saved to disk.

Users wishing to access processed data should load it from the HDF5 formatted .topostats files that are saved during processing.
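For anyone who wants to pull arrays back out of a `.topostats` file, a minimal sketch using `h5py` might look like the following. The internal group and dataset names are not given in this thread, so the helper discovers them rather than assuming any; `list_datasets()` and `load_dataset()` are hypothetical helper names, not part of TopoStats.

```python
from pathlib import Path

import h5py
import numpy as np


def list_datasets(path: Path) -> dict[str, tuple[int, ...]]:
    """Return {dataset path: shape} for every dataset in an HDF5 file.

    Hypothetical helper: it walks the file instead of assuming a layout,
    because the structure of .topostats files is not described here.
    """
    found: dict[str, tuple[int, ...]] = {}

    def _collect(name, obj):
        # visititems() calls this for every group and dataset in the file;
        # only datasets hold array data, so groups are skipped.
        if isinstance(obj, h5py.Dataset):
            found[name] = obj.shape

    with h5py.File(path, "r") as handle:
        handle.visititems(_collect)
    return found


def load_dataset(path: Path, key: str) -> np.ndarray:
    """Read one dataset fully into memory as a NumPy array."""
    with h5py.File(path, "r") as handle:
        return handle[key][()]
```

Calling `list_datasets()` first shows which keys exist, and any of them can then be passed to `load_dataset()`.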

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (de3ac4b) 84.73% compared to head (ae1334d) 84.72%. Report is 50 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #804      +/-   ##
==========================================
- Coverage   84.73%   84.72%   -0.01%
==========================================
  Files          21       21
  Lines        3196     3195       -1
==========================================
- Hits         2708     2707       -1
  Misses        488      488
```


ns-rse commented 7 months ago

Thanks.

Just a small best practice question though - is it typical for a library to have its own save function like this even though it's basically a wrapper for another library's function (numpy)?

It depends. Making it a wrapper lets you add functionality to a small, atomic function (save_array()) while keeping it focused on doing one thing, and then you only need to update the calls made to it from elsewhere in the code base. For example, if you had a labelled image but wanted to convert it to a mask of all objects (ignoring the fact that we already have these available!), you could add an option to the definition, def save_array(..., mask: bool = False), and have it call whatever function is used for masking. Then, wherever in processing you do want to mask before saving, you would pass mask=True and be done. Of course, you could do the masking as a separate step in processing before calling save_array(), but over time the steps in processing would grow and grow (it's already pretty long), so my preference is to add options in this manner.
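As a rough illustration of the idea, here is a sketch of a wrapper with an optional masking step. The signature is simplified and is not the actual `io.save_array()` in TopoStats; `labels_to_mask()` is a hypothetical stand-in for "whatever function is used for masking", and it assumes background pixels are labelled 0.

```python
from pathlib import Path

import numpy as np


def labels_to_mask(labelled: np.ndarray) -> np.ndarray:
    """Collapse a labelled image to a boolean mask of all objects.

    Hypothetical helper; assumes 0 is background and 1..n are objects.
    """
    return labelled > 0


def save_array(array: np.ndarray, outpath: Path, filename: str, mask: bool = False) -> Path:
    """Save an array as .npy, optionally converting it to a mask first.

    Illustrative signature only, not the TopoStats one.
    """
    if mask:
        # The optional step lives inside the wrapper, so callers opt in
        # with a single keyword argument.
        array = labels_to_mask(array)
    outpath.mkdir(parents=True, exist_ok=True)
    target = outpath / f"{filename}.npy"
    np.save(target, array)
    return target
```

A call site that wants the mask then changes from `save_array(labels, out, "grains")` to `save_array(labels, out, "grains", mask=True)`, and every other caller is unaffected because the default is `False`.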

This is pretty much what has been done with all the refactoring: breaking long chunks of code into smaller functions with options that are then called. It might seem like a faff when there is no immediate benefit, but further down the line I think it makes it easier to extend functionality.