Out of Memory - Githubissues

rprueckl commented 2 years ago

Hi,

I tried to run QuickNAT on a dataset that I preprocessed with FreeSurfer as described in your documentation. My computer is equipped with an NVIDIA GeForce GTX 1650 (4 GB RAM). I am using python run.py --mode=eval_bulk to start execution. I have set the batch size to 1, but I am still getting this:

RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 4.00 GiB total capacity; 1.67 GiB already allocated; 0 bytes free; 2.74 GiB reserved in total by PyTorch)

Is it possible to work around this problem somehow, or is 4 GB of RAM simply not enough to run QuickNAT?

Thanks for your time!

rprueckl commented 2 years ago

Hi again,

today I tried with an RTX 2080 (8GB) with a similar result:

RuntimeError: CUDA out of memory. Tried to allocate 2.06 GiB (GPU 0; 8.00 GiB total capacity; 4.21 GiB already allocated; 0 bytes free; 6.28 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I think in the paper a GPU with 12GB was used?
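
The second error message suggests checking whether reserved memory is much larger than allocated memory. As a rough sketch (the helper names below are hypothetical, not part of QuickNAT or PyTorch), the gap can be read off the message, and the max_split_size_mb option can be set through the PYTORCH_CUDA_ALLOC_CONF environment variable. For the RTX 2080 message the gap is about 2.07 GiB, close to the 2.06 GiB request that failed, so fragmentation may play a role. This is only a hint: the setting cannot help if the model genuinely needs more memory than the card has, and the value 128 is an arbitrary example.

```python
import os
import re

# Matches the sizes PyTorch prints, e.g. "4.21 GiB already allocated".
_SIZE = r"(\d+(?:\.\d+)?) (MiB|GiB)"
_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0}


def _as_gib(match):
    """Convert a regex match of '<number> <unit>' to GiB."""
    return float(match.group(1)) * _TO_GIB[match.group(2)]


def reserved_but_unallocated_gib(message):
    """Return reserved minus allocated memory (in GiB) from an OOM message."""
    allocated = re.search(_SIZE + r" already allocated", message)
    reserved = re.search(_SIZE + r" reserved in total", message)
    if allocated is None or reserved is None:
        raise ValueError("message does not look like a CUDA OOM report")
    return _as_gib(reserved) - _as_gib(allocated)


def set_max_split_size(megabytes):
    """Set the allocator option; must happen before CUDA is first used."""
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = f"max_split_size_mb:{megabytes}"


msg = ("CUDA out of memory. Tried to allocate 2.06 GiB (GPU 0; 8.00 GiB total "
       "capacity; 4.21 GiB already allocated; 0 bytes free; 6.28 GiB reserved "
       "in total by PyTorch)")
gap = reserved_but_unallocated_gib(msg)
print(f"reserved but unallocated: {gap:.2f} GiB")

set_max_split_size(128)  # arbitrary example value, not a recommendation
```

The environment variable is only read when the CUDA allocator starts up, so it has to be set before run.py touches the GPU, for example in the shell that launches it.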

fhfhfh999 commented 2 years ago

Hi, it seems you have run the code, but I cannot understand how 'convert_h5.py' works, especially the labels. Could you give me a hint? I think the author of this paper will never answer any questions...

fhfhfh999 commented 2 years ago

The paper mentions in Section 2.3: "We use a constant weight decay of 0.0001. Batch size is set to 4, limited by the 12GB RAM of the NVIDIA TITAN X Pascal GPU."

fhfhfh999 commented 2 years ago

The paper mentions that they use FreeSurfer to process the IXI dataset. But when I started to learn FreeSurfer, I found that FreeSurfer does not produce a single "Auxiliary label". The output of FreeSurfer contains many files, including a folder named "label", and I think that folder is not the "label" in "convert_h5.py". So, how do I start training?

rprueckl commented 2 years ago

Hi,

I never executed training, only segmentation. I preprocessed my NIfTI files with mri_convert --conform <input.nii> <output.nii>
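
For more than a handful of files, the conform step could be scripted. The following is a minimal sketch, assuming mri_convert is on the PATH and that the inputs are uncompressed .nii files in one folder; conform_command and conform_all are hypothetical helper names, not part of FreeSurfer or QuickNAT. By default it only builds the commands (dry run) so they can be inspected first.

```python
import subprocess
from pathlib import Path
from typing import List


def conform_command(src: Path, dst: Path) -> List[str]:
    """Build the mri_convert call quoted in the post for one file."""
    return ["mri_convert", "--conform", str(src), str(dst)]


def conform_all(in_dir: Path, out_dir: Path, dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one conform command per .nii file in in_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = [conform_command(src, out_dir / src.name)
                for src in sorted(in_dir.glob("*.nii"))]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first file that mri_convert rejects.
            subprocess.run(cmd, check=True)
    return commands
```

Calling conform_all(Path("raw"), Path("process"), dry_run=False) would then run the conversions, where "raw" and "process" are placeholder folder names.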

Regarding executing QuickNAT, my steps are as follows:

Install QuickNAT:
- install CUDA 11.3
- install Python 3.7.9 (x64)
- make sure the correct Python is in PATH
- create a folder D:/quicknat_test
- copy the folder named after the GitHub commit hash to D:/quicknat_test/src
- edit "settings_eval.ini" and change the following:
  data_dir = "D:/quicknat_test/nifti/process"
  directory_struct = "Linear"
  estimate_uncertainty = "True"
- start cmd with admin permissions
- install virtualenv for Python (if not already done): pip install virtualenv
- create a virtual environment: virtualenv D:/quicknat_test/env
- activate the virtual environment: D:\quicknat_test\env\Scripts\activate.bat
- go to the src folder: cd D:\quicknat_test\src\4e4e97e912b9f75f9c299065922009da737c4ef9
- install the correct torch: pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
- install the rest of the dependencies: python -m pip install -r requirements.txt

Execute QuickNAT:
- copy the preprocessed NIfTI files into D:/quicknat_test/nifti/process
- edit "test_list.txt" and enter the filenames in the data_dir you want to process
- start cmd with admin permissions
- activate the virtual environment: D:\quicknat_test\env\Scripts\activate.bat
- go to the src folder: cd D:\quicknat_test\src\4e4e97e912b9f75f9c299065922009da737c4ef9
- start processing: python run.py --mode=eval_bulk
- results are written to: D:\quicknat_test\src\4e4e97e912b9f75f9c299065922009da737c4ef9\ixi_test_seg\one_view
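
The "edit test_list.txt" step above can be automated when the folder holds many files. This is a sketch under two assumptions: that with directory_struct = "Linear" the list holds one filename per line, as the post describes, and that the inputs end in .nii or .nii.gz. write_test_list is a hypothetical helper, and the demonstration runs on a throwaway directory rather than the real data_dir.

```python
import tempfile
from pathlib import Path
from typing import List


def write_test_list(data_dir: Path, list_file: Path) -> List[str]:
    """Write the NIfTI filenames found in data_dir to list_file, one per line."""
    names = sorted(p.name for p in data_dir.iterdir()
                   if p.name.endswith((".nii", ".nii.gz")))
    list_file.write_text("\n".join(names) + "\n")
    return names


# Self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "process"
    data_dir.mkdir()
    for name in ("subject_b.nii", "subject_a.nii", "notes.txt"):
        (data_dir / name).touch()
    listed = write_test_list(data_dir, Path(tmp) / "test_list.txt")
    written = (Path(tmp) / "test_list.txt").read_text()
    print(listed)
```

For the setup in the post, data_dir would be D:/quicknat_test/nifti/process and list_file the test_list.txt in the src folder.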

fhfhfh999 commented 2 years ago

Thanks! I'll try it!

arickm commented 2 years ago

> The paper mentions that they use FreeSurfer to process the IXI dataset. But when I started to learn FreeSurfer, I found that FreeSurfer does not produce a single "Auxiliary label". The output of FreeSurfer contains many files, including a folder named "label", and I think that folder is not the "label" in "convert_h5.py". So, how do I start training?

Hi, for training with FreeSurfer segmentations you can use the segmentation file mri/aseg.mgz, which contains the segmentation of the subcortical structures used in QuickNAT, and the MRI volume mri/orig.mgz. The function remap_labels in utils/preprocessor.py shows which of the classes were used.
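
To make the file layout concrete, here is a small sketch that picks the volume/label pair out of one FreeSurfer subject folder. training_pair is a hypothetical helper, not part of QuickNAT; it only locates the two files named above and does not load or remap them. Reading .mgz files needs an additional library such as nibabel, and the class remapping itself is whatever remap_labels in the repository does.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def training_pair(subject_dir: Path) -> Tuple[Path, Path]:
    """Return (volume, labels) paths for one FreeSurfer subject folder."""
    volume = subject_dir / "mri" / "orig.mgz"
    labels = subject_dir / "mri" / "aseg.mgz"
    missing = [str(p) for p in (volume, labels) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing FreeSurfer output: " + ", ".join(missing))
    return volume, labels


# Self-contained demonstration with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    subject = Path(tmp) / "subject_01"  # placeholder subject name
    (subject / "mri").mkdir(parents=True)
    (subject / "mri" / "orig.mgz").touch()
    (subject / "mri" / "aseg.mgz").touch()
    volume, labels = training_pair(subject)
    pair_names = (volume.name, labels.name)
    print(pair_names)
```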

ai-med / quickNAT_pytorch

Out of Memory #31