alejoGT1202 opened 2 years ago
Hi, you can change this line:
https://github.com/WongKinYiu/yolor/blob/be7da6eba2f612a15bf462951d3cdde66755a180/train.py#L219
and this line:
https://github.com/WongKinYiu/yolor/blob/be7da6eba2f612a15bf462951d3cdde66755a180/train.py#L361
I'm not sure why the batch size is doubled during validation, but removing the doubling solved the issue for me.
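For context, here is a minimal sketch of the change being suggested. The exact code at those two lines may differ by commit, and the `create_dataloader` call shown in the comments is illustrative, but the idea is that validation reportedly runs with `batch_size * 2`, and the fix is to pass `batch_size` instead:

```python
# Hedged sketch of the suggested edit (illustrative, not the exact train.py code).
# At the two linked lines, validation is said to use batch_size * 2, e.g.:
#
#   testloader = create_dataloader(test_path, imgsz_test, batch_size * 2, ...)  # before
#   testloader = create_dataloader(test_path, imgsz_test, batch_size, ...)      # after
#
# The helper below only illustrates the memory effect of the doubling.

def val_batch_size(train_bs: int, double: bool = True) -> int:
    """Validation batch size: doubled by default, as reported for train.py."""
    return train_bs * 2 if double else train_bs

# With a training batch size of 2 (as in the question), validation would load
# batches of 4 images, roughly doubling peak activation memory at eval time.
print(val_batch_size(2))                # original behavior -> 4
print(val_batch_size(2, double=False))  # after the change  -> 2
```

Since validation here runs under `torch.no_grad()`-style inference, the doubled batch is normally affordable, but on a 16 GB T4 it can tip a tight training run over the memory limit.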
I'm training on an EC2 instance with a T4 GPU and 16 GB of memory.
I'm using a batch size of 2 and an image size of 960, but after 3 epochs the script is killed because the GPU runs out of memory. How can I overcome this without reducing the batch size to 1?
Thanks for the help.