Closed hg6185 closed 1 year ago
Hello!
total_batch_size is the batch size summed over all GPUs in use. For example, if you set total_batch_size=2 and use two GPUs, each GPU receives one sample.
Does this still create a memory overhead, because the images may still be stored on the GPU?
I think the best way to debug this is to print the input data size during training. The pipeline resizes each input image so that its longest edge is no longer than 1333.
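To make the resizing rule concrete, here is a minimal sketch of a shortest-edge resize with a cap on the longest edge. Only the 1333 cap comes from this thread; the function name `resized_shape` and the short-edge target of 800 are assumptions for illustration and may not match the actual config.

```python
def resized_shape(height, width, short_edge=800, max_size=1333):
    """Return the (height, width) after a shortest-edge resize with a longest-edge cap.

    short_edge=800 is an assumed value; max_size=1333 is the cap mentioned above.
    """
    # Scale so that the shorter side reaches the target length.
    scale = short_edge / min(height, width)
    new_h, new_w = height * scale, width * scale
    # If the longer side now exceeds the cap, shrink both sides again.
    if max(new_h, new_w) > max_size:
        cap = max_size / max(new_h, new_w)
        new_h, new_w = new_h * cap, new_w * cap
    # Round to the nearest integer pixel count.
    return int(new_h + 0.5), int(new_w + 0.5)


# An image of roughly 2.5k x 2k pixels, as described in this thread.
print(resized_shape(2000, 2500))
```

Logging the tensor shape of each batch during training and comparing it with this kind of calculation is a quick way to confirm that the resize is actually applied.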
The images that were provided are also grayscale and in jpg format.
I have not run any experiments on grayscale images yet; I will double-check that. The jpg format is fine.
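As a rough illustration of what a conversion to "RGB" does to a grayscale image, the sketch below uses Pillow directly. It assumes the data loader reads images with Pillow and converts them to the configured format, as described elsewhere in this thread; the in-memory test image stands in for a real jpg file.

```python
from PIL import Image

# Build a small single-channel ("L" mode) image as a stand-in for a grayscale jpg.
gray = Image.new("L", (4, 3), color=128)

# Converting to "RGB" replicates the gray value into all three channels.
rgb = gray.convert("RGB")

print(gray.mode, "->", rgb.mode)
print(rgb.getpixel((0, 0)))
```

Under that assumption, a grayscale jpg reaches the model as a three-channel image with identical channels, so the input shape matches what an RGB-pretrained backbone expects.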
Hi! Thanks for the quick response! I started logging the image sizes, and the resizing works. I will see if this helps me resolve the memory issues.
Yes, you can also try gradient-checkpoint to address the memory issue; we already support this feature in DINO.
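For readers unfamiliar with the technique, here is a minimal, generic PyTorch sketch of gradient checkpointing using `torch.utils.checkpoint.checkpoint`. It is not the DINO implementation or its config option; `Block`, `TinyModel`, and the `use_checkpoint` flag are hypothetical names used only for illustration.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """A small stand-in for one backbone stage (hypothetical)."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)


class TinyModel(nn.Module):
    """Stack of blocks with optional gradient checkpointing (hypothetical)."""

    def __init__(self, dim=8, depth=3, use_checkpoint=False):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpoint and self.training:
                # Do not keep this block's intermediate activations;
                # recompute them during the backward pass instead.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


model = TinyModel(use_checkpoint=True).train()
inputs = torch.randn(2, 8)
loss = model(inputs).sum()
loss.backward()  # gradients are still computed for every block
```

The trade-off is extra compute in the backward pass in exchange for lower activation memory, which tends to matter most with large input images.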
I will keep this in mind. Thank you very much!
So far, I have worked around it by freezing the backbone (https://arxiv.org/abs/2204.00484). I observed effects similar to those reported by Vasconcelos et al.; however, small objects in particular seem to suffer from the lack of backbone fine-tuning.
I am currently training on only 1 GPU (Nvidia V100, unfortunately with only 16 GB VRAM) with batch size 1. My images are also relatively large, at ~2.5k x 2k pixels. After approximately 1200 iterations I encounter the following error:
RuntimeError: CUDA out of memory. Tried to allocate 74.00 MiB (GPU 0; 15.77 GiB total capacity; 13.90 GiB already allocated; 7.88 MiB free; 14.49 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
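The error message points to the `PYTORCH_CUDA_ALLOC_CONF` environment variable and its `max_split_size_mb` option. A minimal sketch of setting it from Python follows; the value 128 is an arbitrary illustration, not a recommendation from this thread, and the helper `reserved_minus_allocated_gib` is hypothetical.

```python
import os

# Must be set before the first CUDA allocation; the simplest way is to set it
# before importing torch. The value 128 is an assumed example value.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"


def reserved_minus_allocated_gib(reserved_gib, allocated_gib):
    """Memory held by the caching allocator but not used by live tensors."""
    return round(reserved_gib - allocated_gib, 2)


# Numbers taken from the error message above.
gap = reserved_minus_allocated_gib(14.49, 13.90)
print(f"reserved but unallocated: {gap} GiB")
```

Here reserved memory is only slightly larger than allocated memory, so fragmentation is probably not the main cause, and this setting alone may not resolve the error.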
My Code to load the dataset:
I use this code (from the example) to load my dataset. Since I am not an expert on parallel computing, I now have two questions:
I have a final question: the images that were provided are also grayscale and in jpg format. From the source code I saw that when they are loaded by Pillow, they are converted to the img_format, which defaults to "RGB". Do I have to change something here?
Thanks in advance!