tramarobin / fctSnPM

Using spm1d package (v.0.4.3), compute anova and post-hoc tests from anova1 to anova3rm, with a non-parametric approach (permutation tests)
GNU General Public License v3.0

[JOSS Review] Long script execution times #4

Closed 0todd0000 closed 3 years ago

0todd0000 commented 3 years ago

Do you know why processing is taking so long?

0todd0000 commented 3 years ago

I discovered one reason why script execution is taking so long: very large files are generated (at least on macOS).

Please either (a) ensure that file sizes are smaller on macOS, or (b) generate figures without saving any files.

Please refer to issue #6 for a related discussion on output options.

tramarobin commented 3 years ago

On my computer (Windows 10, i5 7th gen), the scripts took a little longer than 20 and 5 seconds (40 and 8 seconds since I added the .fig outputs; I did not verify the resulting changes in time and storage) and produced 105 and 30 MB of output. These durations and sizes seem normal, as permutation tests require computational resources and .tiff figures at 300 ppi take up disk space too.

In D1_pairedTtest.m, could you tell me the size of the variable Side? On my computer it is 39 KB. Also in D1_pairedTtest.m, could you tell me the size of the figure Side.tif? On my computer it is 2268 KB, at 4244 x 2144 pixels and a resolution of 300 ppi.

The figures in these examples are created at the size of your screen by default. Your screen resolution is probably higher than mine, so my guess is that the dimensions of these files are larger on your computer.

If that is the case, I will add an input to the example scripts to control the size of the output figures. Do you think a fixed value, for example in cm, should be the default instead of the screen size?

0todd0000 commented 3 years ago

My display resolution is 5120x2880. The script generates single TIFF files that are approximately 30 MB each. A JPG file with comparable resolution is less than 2 MB.

It would be preferable to ensure that the scripts run as rapidly as possible on all systems. TIFF file generation is peripheral to the main purpose of this package, so if TIFF file generation or file saving produces large files and/or takes a long time, it would be preferable to not generate these files.

Is it possible to alter the scripts so that they generate only figures, and not TIFF files? Or to at least offer users a choice whether they want to write files? Default options should ideally execute the most rapidly.

tramarobin commented 3 years ago

As you wrote in issue #6, the generation of TIF files is a stylistic choice. The idea is to create figures that can be used directly in an article, hence .tif at 300 ppi.

However, it may be wrong to use these values as the default. Consequently, the default `imageSize` option was set to 720*480 pixels and the default `imageResolution` was reduced from 300 ppi to 96 ppi. The TIF files created by default are now much smaller, and every computer generates the same files. Besides, they can easily be modified for use in scientific papers.
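To put the pixel dimensions mentioned in this thread side by side, here is a minimal Python sketch of the arithmetic. It assumes 24-bit RGB pixel data with no compression, which is only a rough bound: the thread does not say which compression or canvas size the TIFF export actually uses, and the reported file sizes (2268 KB and about 30 MB) are below these figures. The function names are hypothetical and are not part of fctSnPM.

```python
def uncompressed_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Size of the raw pixel data, assuming no compression (an assumption)."""
    return width_px * height_px * channels * bits_per_channel // 8


def physical_size_inches(width_px, height_px, ppi):
    """Printed width and height implied by a pixel size and a resolution."""
    return width_px / ppi, height_px / ppi


if __name__ == "__main__":
    # Pixel dimensions quoted in the thread.
    cases = {
        "reviewer display": (5120, 2880),
        "author Side.tif": (4244, 2144),
        "new default": (720, 480),
    }
    for label, (w, h) in cases.items():
        mb = uncompressed_bytes(w, h) / 1e6
        print(f"{label}: {w} x {h} px, about {mb:.1f} MB of raw RGB data")
```

Under these assumptions the 720*480 default holds roughly 1 MB of raw pixel data, against roughly 27 MB for a 4244 x 2144 image, so the file size no longer depends on the screen it was generated on.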

Concerning the long script execution time, reducing the size of the TIF files helped a little, but I think the hard drive type (or even the organisation of the hard drive) might be the limiting factor. For instance, the scripts took 45 sec when saving the outputs to my current MATLAB path, 70 sec when saving elsewhere on my SSD, and 75 sec when saving to a USB drive.
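One way to check whether the storage location is the limiting factor is to time a fixed-size write to each candidate directory, independently of the statistical computation. The Python sketch below is only an illustration of that idea: `time_write` is a hypothetical helper, the 1 MB payload is an arbitrary choice, and it does not reproduce what the MATLAB scripts actually write.

```python
import os
import tempfile
import time


def time_write(directory, n_bytes=1_000_000, repeats=3):
    """Return the fastest time (in seconds) to write n_bytes to `directory`."""
    payload = os.urandom(n_bytes)
    timings = []
    for i in range(repeats):
        path = os.path.join(directory, f"write_probe_{i}.bin")
        start = time.perf_counter()
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # ask the OS to push the data to the device
        timings.append(time.perf_counter() - start)
        os.remove(path)  # clean up the probe file
    return min(timings)


if __name__ == "__main__":
    # Replace the temporary directory with the folders to compare,
    # e.g. the working directory, another SSD folder, or a USB drive.
    with tempfile.TemporaryDirectory() as folder:
        print(f"{folder}: {time_write(folder):.4f} s")
```

If the timings differ little between locations while the script durations still differ a lot, the storage device is probably not the main cause.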

0todd0000 commented 3 years ago

> As you wrote in issue #6, the generation of TIF files is a stylistic choice. The idea is to create figures that can be used directly in an article, hence .tif at 300 ppi.

I recommend giving users as much flexibility as possible. Many users like to customize output (e.g. colors, fonts, plot overlays, etc.), and this is not possible with TIFF figures. While I think you will get more users if you produce figure windows and not files, this is a software design choice so in my opinion it needn't be implemented as part of this review process.