Open zws98 opened 7 months ago
When I test the trained Swin-MoE model on multiple GPUs, each process reports a different performance. For each process {t}, I loaded the weights from rank{t}.