Hi, thanks for the great work. I am trying to reproduce your results but am unable to train the model on 6 V100 GPUs with 32 GB of memory each. Did you use 80 GB A100s for training? Also, the code uses neither DeepSpeed nor FSDP, so how should I train the model and reproduce your results with my setup? Best wishes.