CleanSeaSalt opened this issue 4 months ago
I have the same question.
On a 3090 with the batch size set to 2 and gradient accumulation enabled, training took nearly a week, and the metrics were clearly worse than those from multi-GPU training.
Thanks for sharing your experience. I'm a learner; I didn't realize that training on a single GPU would take so long and still fall short of the expected results.
The training takes so long because of CBGS (class-balanced grouping and sampling), which expands the training set from about 28k to about 120k samples. You can try turning CBGS off.
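If this repository uses an mmdetection3d-style Python config (an assumption; check the actual config files), CBGS is typically enabled by wrapping the training dataset in a `CBGSDataset` wrapper, and turning it off means using the underlying dataset directly. A hedged sketch, with placeholder names:

```python
# Hypothetical mmdetection3d-style config fragment -- adapt names/paths to the repo.

# With CBGS: the wrapper resamples the training set to balance classes,
# which inflates the effective epoch size (roughly 28k -> 120k samples).
data = dict(
    train=dict(
        type='CBGSDataset',          # class-balanced grouping and sampling wrapper
        dataset=dict(
            type='NuScenesDataset',  # assumed underlying dataset
            # ... data_root, ann_file, pipeline, etc. ...
        ),
    ),
)

# Without CBGS: drop the wrapper and train on the dataset directly.
# Expect faster epochs but possibly lower accuracy on rare classes.
data = dict(
    train=dict(
        type='NuScenesDataset',
        # ... data_root, ann_file, pipeline, etc. ...
    ),
)
```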
Thank you very much for your excellent work. I only have one RTX 3090. How should I modify the configuration file to reproduce the model you trained on multiple cards?
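One common approach when moving from multi-GPU to single-GPU training is the linear scaling rule: keep the ratio of learning rate to total batch size roughly constant, either by lowering the learning rate or by emulating the original total batch with gradient accumulation. The numbers below are hypothetical placeholders, not this repository's actual settings:

```python
# Hypothetical numbers -- read the repo's released config for the real values.
ngpus_original = 8          # assumed multi-GPU setup
batch_per_gpu_original = 4  # assumed per-GPU batch size
lr_original = 2e-4          # assumed learning rate for the multi-GPU run

total_batch_original = ngpus_original * batch_per_gpu_original  # 32
batch_single = 2            # what fits on one RTX 3090

# Option 1: keep the batch size small and scale the learning rate down
# proportionally (linear scaling rule).
lr_single = lr_original * batch_single / total_batch_original

# Option 2: keep the original learning rate and emulate the original total
# batch by accumulating gradients over several steps before each update.
accumulate_steps = total_batch_original // batch_single  # 16

print(lr_single, accumulate_steps)
```

Note that gradient accumulation does not perfectly reproduce multi-GPU training (e.g. BatchNorm statistics are still computed over the small per-step batch), which may partly explain the gap reported above.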