Thanks for your code!
Could you please specify the computational resources used for your experiments? Specifically, what type and number of GPUs were employed?
If I only have one RTX 3090, can I finish the experiment?
As noted in this issue, we trained our model on 4 A100-80G GPUs for less than 10 hours (batch size = 16). With a smaller batch size, training on a single RTX 3090 should be doable, but it will take considerably longer, e.g., 3-5 days.
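One common way to keep the effective batch size at 16 while fitting in the 3090's 24 GB of VRAM is gradient accumulation. A minimal sketch of the arithmetic (the per-step batch size of 4 is purely an assumption; tune it to whatever your model and sequence length actually fit):

```python
def accumulation_steps(target_batch: int, per_step_batch: int) -> int:
    """Number of forward/backward passes to accumulate before each optimizer step."""
    assert target_batch % per_step_batch == 0, "pick a divisor of the target batch"
    return target_batch // per_step_batch

# The 4x A100 setup used an effective batch size of 16. per_step_batch=4 is a
# hypothetical value for a single RTX 3090 -- adjust it to your memory budget.
steps = accumulation_steps(target_batch=16, per_step_batch=4)
print(steps)  # -> 4: call loss.backward() 4 times, then optimizer.step() once
```

In the training loop, you would divide each loss by `steps` before `backward()` and only call `optimizer.step()` / `optimizer.zero_grad()` every `steps` iterations, so the gradients match the original batch-size-16 run up to numerical noise.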