Wyzix33 opened 1 year ago
Small update: if I change `strategy="ddp"` to `strategy="dp"` it works, but it's slower. This is what I get using dp and 2 GPUs:
Epoch 0: 3%|████▉ | 60/2005 [03:34<1:55:54, 3.58s/it, loss=29.7, v_num=base]
With ddp and one GPU I get:
Epoch 0: 1%|█▊ | 44/4010 [00:40<1:00:38, 1.09it/s, loss=29, v_num=base]
Hi, I just started playing around with Donut and wanted to pretrain a new language. I have 3 AMD 6900 XT GPUs. I am able to run the trainer with one GPU, but if I try to run it with 2 or 3 I get an error using this config:
it throws this:
and using this config:
I get:
Do I need to set something different when using multiple GPUs, or is this a ROCm problem? Any help please...
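For reference, here is a minimal sketch of how the multi-GPU settings are typically passed to a PyTorch Lightning `Trainer` (this assumes the Donut training script builds its trainer this way; the exact argument names vary between Lightning versions, e.g. older releases used `gpus=2` instead of `accelerator`/`devices`):

```python
import pytorch_lightning as pl

# Hypothetical trainer configuration, not taken from the issue.
trainer = pl.Trainer(
    accelerator="gpu",
    devices=2,          # number of GPUs to use
    strategy="ddp",     # one process per GPU; "dp" is the slower workaround above
    max_epochs=1,
)
```

With `ddp`, each GPU runs its own process on a shard of the data, which is why the step count per epoch halves compared to a single GPU; `dp` keeps a single process and scatters each batch, which is generally slower and is deprecated in recent Lightning releases.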