Closed idonahum1 closed 1 year ago
Hi, thanks for the great work. I would like to ask whether you have tried training a DDQ DETR model on a single GPU. I get a CUDA out-of-memory error when using batch_size = 2, which seems a little strange, since your implementation used 8 GPUs × 2 samples per GPU, and my per-GPU setting is the same. Am I doing something wrong? My GPU is a Tesla V100 with 16 GB.
Thanks.
This is indeed a bit strange, because the logs show memory usage of about 11 GB. You can try reducing the number of queries, for example from 900 to 500, and changing the aux query ratio to 1: https://github.com/jshilong/DDQ/blob/a166d18658b6b5b57621c00d6aa04e52a80e65bd/projects/models/ddq_detr.py#L220. This will not cause a significant performance loss.
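As a rough sketch, the suggested changes could be applied as a config override in the usual MMDetection style. The exact field names here (`num_queries`, `dense_topk_ratio`) and the base config filename are assumptions; check `projects/models/ddq_detr.py` and the config files in your checkout before using this.

```python
# Hypothetical memory-saving override for DDQ DETR training.
# Field names and base config path are assumptions -- verify them
# against projects/models/ddq_detr.py in your local checkout.
_base_ = ['./ddq-detr-4scale_r50_8xb2-12e_coco.py']

model = dict(
    num_queries=500,       # reduced from the default 900
    dense_topk_ratio=1.0,  # aux query ratio reduced to 1
)
```

With fewer queries, both the decoder attention maps and the matching cost scale down, which is where most of the savings come from.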
Feel free to reopen the issue if you have any further questions.
I have a similar problem. I use ddq-detr-5scale_r50_8xb2-12e_coco.py, and even with a batch size of only 1, memory usage on a 3090 Ti is about 20 GB. Is there any way to reduce memory usage during training? Thanks a lot.
You can try reducing the number of queries, for example from 900 to 500, and changing the aux query ratio to 1: https://github.com/jshilong/DDQ/blob/a166d18658b6b5b57621c00d6aa04e52a80e65bd/projects/models/ddq_detr.py#L220. This will not cause a significant performance loss. Besides that, the 4-scale config is strong enough in most cases.
Thanks for the quick response. If I lower the number of queries, does that mean I can't use the weights you provide for fine-tuning?