elliohow / fMRI_ROI_Analysis_Tool

An analysis tool that uses per-voxel statistical maps in conjunction with FSL atlases to create per-region statistical maps. Current usage includes the creation of regional maps of temporal signal to noise ratio.
Apache License 2.0

[JOSS REVIEW] Folder selection #44

Closed ZeitgeberH closed 1 year ago

ZeitgeberH commented 1 year ago

Summary

Hi, I went through the "Running the ROI analysis" part. The output folder is missing several subfolders mentioned in that section (see screenshot): Figures, fRAT_report, Statistics.

I noticed the terminal output stating that the statmaps were not specified. This may be the cause, although I ran the statmap part without problems. I think I may have mis-specified the output folder during the analysis. fRAT asks three or four times for an output directory, which may cause confusion and mishaps. I think your way of running the test suite was very smooth. Is it possible to ask the user to specify a root folder for a project only at the beginning and let fRAT handle all the subfolder paths? I think that is one of the strengths of fRAT: scaffolding project folders to automatically organize multiple data sources and analysis results.

Platform details:

[Screenshot: Capture]

elliohow commented 1 year ago

Hi, please can you paste the analysis_log.toml file here so I can take a look?

ZeitgeberH commented 1 year ago

Here's the .toml file. I see your point. Let me turn the run_statistics and run_plotting options on and run it again.

General information

version = 1.3.5
config_file_used = 'fRAT_config.toml'

General

run_analysis = true  # true or false. Can skip this step if json files have already been created.
run_statistics = false  # true or false.
run_plotting = false  # true or false.
verbose = true  # true or false.
verbose_cmd_line_args = false  # true or false.
multicore_processing = false  # true or false. Use multicore processing during analysis? Multicore processing currently works within participants not between them. Recommended: true
max_core_usage = 'max'  # 'max' to select number of cores available on the system, alternatively an int to manually select number of cores to use. Recommended: 'max' Options: ['max', 6, 5, 4, 3, 2, 1].
brain_file_loc = ''  # Either the absolute location of brain files or blank, if blank then a browser window will allow you to search for the files at runtime. If passing in this information as a command line flag, this will be ignored.
report_output_folder = ''  # Either the absolute location of json files or blank, if blank then a browser window will allow you to search for the files at runtime. If passing in this information as a command line flag, this will be ignored.
averaging_type = 'Participant averaged'  # Participant averaged or Session averaged. This setting is used to determine which statistics to use for plotting, and when accessing results (for example through the interactive report). Note: Histograms will always use the raw results. The linear mixed model from the statistics will always use session averaged data Options: ['Session averaged', 'Participant averaged'].
parameter_file = 'paramValues.csv'  # Recommended: paramValues.csv Name of the file to parse for critical params. Option added to allow quick swapping between different parameter files.
file_cleanup = 'move'  # Move or delete intermediate files. Options: ['move', 'delete'].

Installation testing

delete_test_folder = 'Never'  # Option to choose whether the folder generated while running tests is deleted upon completion. This only applies when running the full comparison. Options: ['Always', 'If completed without error', 'Never'].
verbose_errors = true  # true or false. Print all missing files and differences found during testing to the terminal.

Analysis

atlas_number = 'HarvardOxford-cort'  # Options: ['Cerebellum-MNIflirt', 'Cerebellum-MNIfnirt', 'HarvardOxford-cort', 'HarvardOxford-sub', 'JHU-ICBM-labels', 'JHU-ICBM-tracts', 'juelich', 'MNI', 'SMATT-labels', 'STN', 'striatum-structural', 'Talairach-labels', 'Thalamus'].
input_folder_name = 'func_cleaned'  # Folder found in each subjects directory containing the files to be analysed. func_cleaned is the default option as this folder will automatically be created when making statmaps. If the "Noise volume included in time series" option was set to true, or motion outlier removal was used when creating the statmaps, this folder will contain cleaned versions of the original func files. However if these options were not used when creating the statmaps, the folder will still be present, however the files will be identical to those in the "func" folder.
output_folder = 'DEFAULT'  # Directory to save output. If set to DEFAULT, output directory will be set to the cortical atlas used appended with "_ROI_report". Example: HarvardOxford-Cortical_ROI_report/
dof = 12  # Degrees of freedom for FLIRT (only used for the fMRI to anatomical alignment when using Correlation Ratio cost function). Recommended: 12
anat_align_cost_function = 'BBR'  # BBR or Correlation Ratio. Recommended: BBR. Using BBR (Boundary-Based Registration) requires an FSL FAST segmentation (this will be automatically created if necessary if the Run FSL FAST option is set to "Run if files not found") and a wholehead non-brain extracted anatomical placed in the anat folder. Options: ['BBR', 'Correlation Ratio'].
grey_matter_segment = true  # true or false. Recommended: true if using a cortical atlas. Note: FSL FAST segmentation files should be placed in the sub-{id}/fslfast/ directory. Only the FSL FAST file appended with pve_1 needs to be in this directory, however if all files output by FAST are placed in this directory, then fRAT will find the necessary file.
run_fsl_fast = 'Run if files not found'  # Recommended: "Run if files not found". These files will only be searched for (and thus created) if "Use FSL FAST segmentation" is set to true. Options: ['Run if files not found', 'Never run'].
fslfast_min_prob = 0.1  # Recommended: 0.1
stat_map_folder = ''  # Folder name which contains the statistical map files. Example: temporalSNR_report
stat_map_suffix = '_tStd.nii.gz'  # File name suffix of the statistical map files. Include the file extension. Example: _tSNR.img
conf_level_number = '95%, 1.96'  # Set the confidence level for confidence interval calculations. Numbers represent the confidence level and the corresponding critical z value. Recommended: 95%, 1.96. Options: ['80%, 1.28', '85%, 1.44', '90%, 1.64', '95%, 1.96', '98%, 2.33', '99%, 2.58'].
binary_params = ['']  # Add parameters here which will either be on or off.

Outlier detection

noise_cutoff = false  # true or false. Calculate a minimum cutoff value to be included in an ROI, based on voxels not assigned an ROI or that have been excluded from analysis. Voxels with values of 0 are not included when calculating the noise cutoff. Useful for statistical maps where extracranial voxels are likely to have much lower values than those inside the brain such as tSNR maps. Recommended: true.
gaussian_outlier_detection = false  # true or false. Fit a gaussian to the data to determine outliers using Elliptic Envelope. Recommended: true.
gaussian_outlier_contamination = 0.1  # Percent of expected outliers in dataset Recommended: 0.1
gaussian_outlier_location = 'below gaussian'  # Data to remove (if gaussian outlier detection is true). For example: if set to below gaussian, data below the gaussian will be removed. Recommended: below gaussian. Options: ['below gaussian', 'above gaussian', 'both'].

Parsing

parameter_dict1 = ['Multiband', 'SENSE']  # Comma-separated list of independent variables. The critical parameter settings are used to supply the names and file name abbreviations of the independent variables, therefore fRAT supports the use of any parameters (and any number of them). As these critical parameters will also be used when labelling the rows and columns of both the violin plots and histograms, they should be written as you want them to appear in these figures. Note: Leave blank if you do not want to compare between different conditions, for example, if you wish to see the overall tSNR for each region across the entire dataset.
parameter_dict2 = ['mb', 's']  # Comma-separated list of terms to parse the file name for. Each entry corresponds to a critical parameter above. Optional if using table parameter verification, however if the file name contains this information it can use this information to auto-detect the critical parameters used for each fMRI volume. Note: This field can be blank.
make_folder_structure = false  # true or false. Make folder structure when creating paramValues.csv
parsing_folder = 'func'  # Folder to find files to add to paramValues.csv. If using "Make folder structure" option, this will be the directory the files in the participant folder will be moved to.

elliohow commented 1 year ago

You can turn off the analysis option now since it has already been run. The analysis output folder should now be selected instead of the root folder, though. Not sure if that is clear from the documentation.

ZeitgeberH commented 1 year ago

That's a good feature! Reusing the analysis to tweak the plotting and reporting part is definitely worth highlighting in your docs. I may have missed that part of the docs.


With analysis off, it gives me a FileNotFoundError:

FileNotFoundError: File /home/mlk/Documents/fRAT/testData/Overall/Summarised_results/Participant_averaged_results/combined_results.json does not exist.

I checked the combined_results.json file, and it does exist in that directory. I figured out the error was because I had pointed the saving folder to the root folder.

Pointing it to the output folder HarvardOxford-Cortical_ROI_report works.
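
The mix-up described above (selecting the project root instead of the analysis output folder) could be caught with a check before statistics or plotting start. The following is a minimal sketch, not fRAT's own code: `find_combined_results` and `EXPECTED_RESULTS` are hypothetical names, and the relative path is an assumption taken from the error message.

```python
from pathlib import Path

# Relative path taken from the FileNotFoundError above. Treating it as fixed
# is an assumption, not documented fRAT behaviour.
EXPECTED_RESULTS = Path(
    "Overall",
    "Summarised_results",
    "Participant_averaged_results",
    "combined_results.json",
)


def find_combined_results(selected_folder):
    """Return the path to combined_results.json inside selected_folder.

    Raises FileNotFoundError with a hint when the file is missing, which is
    what happens if the root folder is selected instead of the report folder.
    """
    candidate = Path(selected_folder) / EXPECTED_RESULTS
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} does not exist. Select the analysis output folder "
            "(for example HarvardOxford-Cortical_ROI_report), not the root folder."
        )
    return candidate
```

Failing early with a message that names the expected folder type would point the user at the cause straight away.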


The analysis was killed at the "Saving Participants table" step:

Saving Participants table.
Killed
(frat310) mlk@MingPC:~$ /home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/multiprocess/resource_tracker.py:227: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Not sure what the cause is. I noticed this step requires a lot of RAM (~20 GB for your example data; my desktop has 64 GB of RAM).

elliohow commented 1 year ago

It may be worth reducing the max core usage to 6 as I see it is currently set to max. If that solves the issue I'll add a hard cap into fRAT to disallow usage of more than 6 cores at a time to reduce RAM usage. How many cores and threads does your computer have?

ZeitgeberH commented 1 year ago

24

ZeitgeberH commented 1 year ago

There's a type error in the function start_processing_pool at line 323 in utils.py. You may want to wrap it: replace workers = config.max_core_usage with workers = int(config.max_core_usage).
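
One caveat with a bare `int()` cast: the config lists 'max' as a valid value for max_core_usage, and `int('max')` raises a ValueError. The following is a minimal sketch of a helper that accepts both forms. `resolve_worker_count` is a hypothetical name and this is not fRAT's actual implementation.

```python
import os


def resolve_worker_count(max_core_usage):
    """Turn the max_core_usage setting into a positive worker count.

    Accepts 'max' (use every core the OS reports), an int, or a numeric string.
    """
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    if isinstance(max_core_usage, str) and max_core_usage.strip().lower() == "max":
        return available
    workers = int(max_core_usage)  # raises ValueError for non-numeric strings
    if workers < 1:
        raise ValueError("max_core_usage must be 'max' or a positive integer")
    # Never request more workers than the machine has cores.
    return min(workers, available)
```

For example, `resolve_worker_count('6')` and `resolve_worker_count(6)` give the same result, capped at the number of cores available.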

ZeitgeberH commented 1 year ago

Similar error for 6 workers.

Saving Participants table.
Maximum ROI value of: 2 seen for parameter combination: mb4_s1.5. Creating figures with this colourbar limit.
Saving Participants_same_scale table.
Killed

ZeitgeberH commented 1 year ago

Same error for 3 workers.

ZeitgeberH commented 1 year ago

Tried with multicore processing off, still the same problem. Verbose mode gave some warnings (see below). You could try closing those saved figures. Maybe that is the culprit.

Saving Sessions table.
/home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/fRAT/utils/figures.py:270: ResourceWarning: unclosed file <_io.BufferedReader name='mb4_s1.5_Sessions.png'>
/home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/fRAT/utils/figures.py:270: ResourceWarning: unclosed file <_io.BufferedReader name='mb1_s2.0_Sessions.png'>
/home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/fRAT/utils/figures.py:270: ResourceWarning: unclosed file <_io.BufferedReader name='mb1_s1.5_Sessions.png'>
/home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/fRAT/utils/figures.py:270: ResourceWarning: unclosed file <_io.BufferedReader name='mb3_s1.5_Sessions.png'>
/home/mlk/.local/pipx/venvs/frat-brain/lib/python3.10/site-packages/fRAT/utils/figures.py:270: ResourceWarning: unclosed file <_io.BufferedReader name='mb3_s2.0_Sessions.png'>
Killed

elliohow commented 1 year ago

Thanks for trying it with different configurations. It has been a bit of an odd one, as on Mac this hasn't been an issue and RAM usage barely increases. However, I think I have found the culprit and have changed the code from:

im = Image.open(png_path)
width, height = im.size

To:

with Image.open(png_path) as im:
    width, height = im.size

I've also changed the multicore processing line to workers = int(config.max_core_usage). Hopefully 1.3.6 fixes it!

elliohow commented 1 year ago

> I noticed the terminal output stating that the statmaps were not specified. This may be the cause, although I ran the statmap part without problems.

This message warns that a log of the statmap folder used won't be saved. If there is one statmap folder for each participant and the statmap folder to use has not been specified, fRAT will default to using that folder and show that warning. If the statmap folder has not been specified and there are multiple statmap folders, the program will exit with an error.
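
The selection rule described above can be sketched roughly as follows. This is an illustration only: `choose_statmap_folder` and its arguments are hypothetical names, not fRAT's code, and treating "no folders found" as an error is an added assumption.

```python
import warnings


def choose_statmap_folder(specified, found_folders):
    """Pick the statmap folder following the rule described above.

    specified: the stat_map_folder setting ('' when unset).
    found_folders: statmap folder names found for a participant.
    """
    if specified:
        return specified
    if len(found_folders) == 1:
        # A single candidate: use it, but warn that it will not be logged.
        warnings.warn(
            "stat_map_folder not specified; defaulting to "
            f"'{found_folders[0]}'. The folder used will not be logged."
        )
        return found_folders[0]
    # Several candidates (or, by assumption, none): refuse to guess.
    raise ValueError(
        f"stat_map_folder not specified and {len(found_folders)} statmap "
        "folders were found; set stat_map_folder explicitly."
    )
```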

> I think I may have mis-specified the output folder during the analysis. fRAT asks three or four times for an output directory, which may cause confusion and mishaps. I think your way of running the test suite was very smooth. Is it possible to ask the user to specify a root folder for a project only at the beginning and let fRAT handle all the subfolder paths? I think that is one of the strengths of fRAT: scaffolding project folders to automatically organize multiple data sources and analysis results.

Currently the user can set the locations for the root and report output folders in the GUI. If paths have been given for these settings, the file explorer won't pop up. It would also be useful to have the root, statmap and output folders saved for the current session once those steps have been run, so I've added this to my list of future additions.
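
As a rough illustration of the per-session idea, the sketch below remembers each folder after it is first chosen and skips the prompt when a path is already configured. `SessionFolders` and the `ask_user` callback are hypothetical and do not reflect how fRAT implements (or will implement) this.

```python
class SessionFolders:
    """Remember folder choices so each one is asked for at most once."""

    def __init__(self, configured=None):
        # Non-empty configured paths (e.g. brain_file_loc) are used as-is;
        # blank entries are dropped so they still trigger a prompt.
        self._paths = {k: v for k, v in (configured or {}).items() if v}

    def get(self, name, ask_user):
        """Return the stored path for name, calling ask_user() only if unset."""
        if name not in self._paths:
            self._paths[name] = ask_user()
        return self._paths[name]
```

Here `ask_user` stands in for whatever opens the file explorer; it is only called the first time a given folder is needed.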

ZeitgeberH commented 1 year ago

Sounds good. After upgrading to version 1.3.6, I still got the same error without multicore processing. I am closing this issue now, as this may not be a problem on other platforms, as you observed on your OS.

elliohow commented 1 year ago

Took a while to debug this one. Looks like it's a known issue with Matplotlib not being able to clean up plots properly when creating them in a loop, causing a memory leak. It's popped up quite a few times in their issues: https://github.com/matplotlib/matplotlib/issues/20490

Changing the Matplotlib backend to a non-GUI backend fixed this:

import matplotlib
matplotlib.use('Agg')

I'm not sure why it isn't causing issues on Mac; it may be a difference in how Tk handles Matplotlib GUI elements between the two OSs, but I've also seen people report it as an issue when using a Jupyter notebook. 1.3.7 should fix this. Tagging @billbrod in case you also run into this issue.
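
For anyone hitting the same leak in their own plotting loops, here is a minimal sketch of the pattern. It selects the Agg backend as described above and, as an extra precaution that is not stated as part of the fRAT fix, explicitly closes each figure after saving. `save_plots` is a hypothetical helper, not fRAT's plotting code.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # select the non-GUI backend before importing pyplot
import matplotlib.pyplot as plt


def save_plots(series_by_name, out_dir):
    """Save one PNG per series, closing each figure once it is written."""
    out_dir = Path(out_dir)
    saved = []
    for name, values in series_by_name.items():
        fig, ax = plt.subplots()
        try:
            ax.plot(values)
            ax.set_title(name)
            path = out_dir / f"{name}.png"
            fig.savefig(path)
            saved.append(path)
        finally:
            # Release the figure even if plotting or saving fails, so
            # figures do not accumulate across loop iterations.
            plt.close(fig)
    return saved
```

After the loop, `plt.get_fignums()` should return an empty list, which is a quick way to confirm that no figures are still being held open.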