Hi @8596858, we have tried training on a 24GB GeForce RTX 3090, a 48GB RTX A6000, and an 80GB A100. The only differences are the batch size and the number of gradient accumulation steps, so our model can be trained on any GPU with at least 24GB of VRAM. Good luck!
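For anyone adapting the settings to their own card, here is a rough sketch of the arithmetic behind trading batch size for gradient accumulation steps. The target effective batch size of 64 and the per-GPU batch sizes are made-up placeholders, not this project's actual configuration, and `accumulation_steps` is a hypothetical helper rather than part of the repository.

```python
import math


def accumulation_steps(effective_batch: int, per_device_batch: int, num_gpus: int = 1) -> int:
    """Return the gradient accumulation steps needed to reach `effective_batch`.

    effective batch = per-device batch size * number of GPUs * accumulation steps
    """
    samples_per_step = per_device_batch * num_gpus
    # Round up so the effective batch is never smaller than the target.
    return math.ceil(effective_batch / samples_per_step)


# Hypothetical values for illustration only; tune them to what fits in memory.
EFFECTIVE_BATCH = 64
EXAMPLE_GPUS = [
    ("GeForce RTX 3090 (24GB)", 4),
    ("RTX A6000 (48GB)", 8),
    ("A100 (80GB)", 16),
]

for name, per_device_batch in EXAMPLE_GPUS:
    steps = accumulation_steps(EFFECTIVE_BATCH, per_device_batch)
    print(f"{name}: batch_size={per_device_batch}, grad_accum_steps={steps}")
```

A smaller card uses a smaller per-step batch and more accumulation steps, so the effective batch size stays the same across GPUs.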
Hello, we are very interested in your project and we would like to try training with your code. What type of GPU did you use?
Thank you.