CalvinYang0 / CRNet


CUDA memory error occurred during model.test() #7

Open CharisWg opened 6 days ago

CharisWg commented 6 days ago

When running test.py, I encountered the following error:

RuntimeError: CUDA out of memory. Tried to allocate 4.45 GiB (GPU 0; 10.74 GiB total capacity; 7.17 GiB already allocated; 2.05 GiB free; 7.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.

I trained the model successfully before, so I'm wondering why this CUDA memory error occurs during model.test(). Is this normal?

CalvinYang0 commented 5 days ago

This is quite normal. You can resolve it by reducing the test batch size. If the issue persists even with a batch size of 1, you can split the test data into appropriately sized patches, run inference on each patch, and then reassemble the outputs (which may result in some performance loss).

CharisWg commented 4 days ago

Could you tell me how to set the batch size to 1 in test.py, and how to split the test data into patches and then reassemble them?

python test.py --dataset_name bracketire --model cat --name "coca" --dataroot "./data/Bracketire/" --load_iter 500 --save_imgs True --calc_metrics False --gpu_id "0" -j 8 --block Convnext

CharisWg commented 4 days ago

Can I use only JPG as the input? Or can you suggest what type of image is needed for the input? Do I need to create raw_1/4/16/64/256.npy files?

CalvinYang0 commented 3 days ago

If you haven't made any specific changes, the batch size already defaults to 1 when the test set is created. You only need to add an extra loop during inference that splits each test input into patches and runs the model on each patch. Any remainder that is too small to form a full patch can be padded with zeros. After the loop, reassemble the patch outputs in order. This method may result in some performance loss. If you haven't resolved the issue yet, I can organize the code from another project for your reference.
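
For reference, here is a minimal sketch of such a loop. It is not code from this repository: the names `padded_size`, `tile_origins`, and `tiled_inference`, and the patch size of 512, are hypothetical. It assumes a model that takes a single (B, C, H, W) tensor and returns an output with the same spatial size. If the model consumes several inputs per sample, each one would have to be cropped with the same window.

```python
def padded_size(length, patch):
    """Smallest multiple of `patch` that is >= `length`."""
    return (length + patch - 1) // patch * patch


def tile_origins(length, patch):
    """Start offsets of the non-overlapping tiles that cover the padded axis."""
    return list(range(0, padded_size(length, patch), patch))


def tiled_inference(model, x, patch=512):
    """Run `model` on zero-padded, non-overlapping tiles of `x` (B, C, H, W)
    and stitch the outputs back together in order.

    Assumption: `model` maps a (B, C, p, p) tensor to a tensor with the same
    spatial size. This is a hypothetical interface, not the one in test.py.
    """
    import torch  # imported lazily so the helpers above work without PyTorch
    import torch.nn.functional as F

    _, _, h, w = x.shape
    hp, wp = padded_size(h, patch), padded_size(w, patch)
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    x = F.pad(x, (0, wp - w, 0, hp - h), mode="constant", value=0.0)

    rows = []
    with torch.no_grad():  # no autograd graph is needed at test time
        for top in tile_origins(h, patch):
            row = []
            for left in tile_origins(w, patch):
                tile = x[:, :, top:top + patch, left:left + patch]
                # Move each result to the CPU so only one tile's
                # output has to be held in GPU memory at a time.
                row.append(model(tile).cpu())
            rows.append(torch.cat(row, dim=3))  # stitch along the width
    out = torch.cat(rows, dim=2)  # stitch along the height
    return out[:, :, :h, :w]  # crop the zero padding away
```

Each tile is processed without any context from its neighbours, so visible seams at tile borders are one plausible source of the performance loss mentioned above. A smaller `patch` lowers peak memory at the cost of more seams.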

CalvinYang0 commented 3 days ago

For the original code, raw image input is essential. However, if you want to apply it to other tasks or datasets, using other image formats is certainly possible.

CharisWg commented 1 day ago

Thank you for your reply. I noticed that each training dataset includes files named raw(1,4,16,64,256).npy. My question is: what is the purpose of these files, and what are the differences between them? For my own dataset, do I need to create raw(1,4,16,64,256).npy files for each image? Additionally, what is raw_gt? Is it the original image? I am not sure what these different raw.npy files are for, so I am unsure how to create them for my own dataset.

CalvinYang0 commented 22 hours ago

For specific information about the data, please refer to the challenge description:
https://github.com/cszhilu1998/BracketIRE

Are you using PNG, TIF, or JPG images? I believe you only need to format the network input as (B, 3, H, W). If related work already exists for your dataset, you just need to replace its model with CRNet. What is your preferred social media platform? If you encounter any difficulties, we might be able to communicate through other means.
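
As an illustration of the (B, 3, H, W) layout, here is a minimal sketch that converts an ordinary RGB image into that shape. The helper names `hwc_to_bchw` and `load_rgb` are hypothetical. The 8-bit input and the scaling to [0, 1] are assumptions, not something the repository specifies, so match whatever normalization your own pipeline uses.

```python
import numpy as np


def hwc_to_bchw(image):
    """Convert one H x W x 3 uint8 image to a float32 (1, 3, H, W) array in [0, 1]."""
    image = np.asarray(image)
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError(f"expected an H x W x 3 image, got shape {image.shape}")
    # Channels-last to channels-first, then scale 8-bit values to [0, 1].
    chw = image.astype(np.float32).transpose(2, 0, 1) / 255.0
    return np.ascontiguousarray(chw[np.newaxis])  # add the batch dimension


def load_rgb(path):
    """Read a PNG, TIF, or JPG file as an H x W x 3 uint8 array (needs Pillow)."""
    from PIL import Image  # imported lazily; only needed when reading files

    with Image.open(path) as img:
        return np.asarray(img.convert("RGB"))
```

The result can be handed to PyTorch with `torch.from_numpy(hwc_to_bchw(load_rgb(path)))`, where `path` points to your own image file.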