Open Rishav-hub opened 9 months ago
Hi Rishav, sorry, I'm not an expert but a first-time commenter XD. Have you tried running it with AMP? Adding the --amp flag to train with automatic mixed precision can speed up training severalfold.
Additionally, try experimenting with different batch sizes and see what the ETA is after a couple of epochs. If the batch size is too high for your GPU's memory, you run out of VRAM, which either crashes training or slows it down significantly. I find a batch size of 6 works on my laptop's RTX 4060, as 8 is too much for it to handle.
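For reference, here is a minimal sketch of where these two knobs live, assuming a standard MMDetection 3.x setup (the config filename and the batch size value are just examples, not taken from the original post):

```python
# Fragment of an MMDetection 3.x config (e.g. my_rtmdet_config.py).
# Lower batch_size if you hit out-of-memory errors.
train_dataloader = dict(
    batch_size=6,   # example value; tune to your GPU's VRAM
    num_workers=4,
)

# AMP is enabled from the command line, not the config:
#   python tools/train.py my_rtmdet_config.py --amp
```

The `--amp` flag is supported by MMDetection's `tools/train.py` and wraps training in automatic mixed precision, which typically cuts both memory use and step time on modern GPUs.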
So I have a dataset comprising 1900 images in total, with 55 classes. I am experimenting with the RTMDet tiny, medium, and large models. But when I start training, it shows an ETA of 4 days for 700 epochs. The issue is that the model does not converge well if I train for fewer epochs.
This is what my config file looks like:
So I need some suggestions from the experts in the community, either for changing my config file or any other improvements.