Closed: dariocf1 closed this issue 2 years ago
Hi,
This error means that you do not have enough GPU memory to train with this batch size. Please decrease the samples_per_gpu value in the config file, e.g. set samples_per_gpu=16, and see if it works. If it does, you can increase it a bit until you hit the same error; if it does not, decrease it further (8, 6, 4, etc.).
Thank you,
I already modified the model.py file, changing samples_per_gpu, but the value does not seem to take effect; training still runs with 32. Where do I find the config file that is loaded for training?
I have also changed the value in models/object_detection/model_templates/custom-object-detection/mobilenet_v2-2s_ssd-512x512/model.py, and there is no change.
When you have instantiated the template, there should exist a $WORK_DIR containing the train.py file. In the same folder there is a model.py file; that is the one that needs to be changed.
> I already modified the model.py file, changing samples_per_gpu, but the value does not seem to take effect; training still runs with 32. Where do I find the config file that is loaded for training?
Have you changed the model.py that is in the template folder?
Yes, I have changed the value at that path and in the file in my training folder; I also changed every samples_per_gpu occurrence in the cloned repository. After changing them all, it still runs with 32.
I have just instantiated the template and tried to change the batch size. I confirm that it is not being applied; I will investigate this.
OK, I've got it. Instead of changing samples_per_gpu in model.py, you need to change batch_size in template.yaml. Sorry for confusing you.
> I also changed every samples_per_gpu occurrence in the cloned repository. After changing them all, it still runs with 32.
No need to do that. Once the template is instantiated, all values are read from the WORK_DIR; the configs in the source repo have no effect.
Hi, I've been trying to run a training with the custom object detector, but I got the following error:
RuntimeError: CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 3.95 GiB total capacity; 2.61 GiB already allocated; 286.69 MiB free; 2.64 GiB reserved in total by PyTorch)
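The halve-and-retry advice given at the top of this thread can be sketched as a small wrapper. Here train_step is a hypothetical stand-in for your actual training entry point; the only assumption is that it raises a RuntimeError containing "out of memory" when the batch does not fit, as in the traceback above:

```python
def train_with_backoff(train_step, batch_size=32, min_batch=1):
    """Retry training with a halved batch size after each CUDA OOM.

    train_step is a callable taking the batch size (a stand-in for your
    real training call). Non-OOM errors are re-raised unchanged.
    """
    while batch_size >= min_batch:
        try:
            return train_step(batch_size)
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # unrelated failure; do not mask it
            batch_size //= 2  # 32 -> 16 -> 8 -> ...
    raise RuntimeError("even the minimum batch size does not fit in GPU memory")
```

This mirrors the manual process suggested earlier: try 16, and if that still fails, keep going down (8, 6, 4, ...) until training fits.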