microsoft / Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
https://arxiv.org/abs/2103.14030
MIT License

Qs about Swin-moe #352

Open zws98 opened 7 months ago

zws98 commented 7 months ago

When I test the trained Swin-MoE on multiple GPUs, each process reports different performance. For each process {t}, I loaded the weights for rank{t}.
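To make the setup concrete, here is a minimal sketch of the per-rank loading described above. The helper names and the `.rank{t}` filename suffix are assumptions for illustration only, not necessarily what the repository uses; the `RANK` environment variable is the one set by `torchrun`.

```python
import os


def current_rank(default: int = 0) -> int:
    """Return this process's global rank.

    Reads the RANK environment variable (set by torchrun) and falls
    back to `default` when it is not set, e.g. in a single-process run.
    """
    return int(os.environ.get("RANK", default))


def rank_checkpoint_path(base_path: str, rank: int) -> str:
    """Build the checkpoint path for one rank.

    ASSUMPTION: the '<base_path>.rank<t>' naming is hypothetical and
    only mirrors the "process {t} loads rank{t}" description.
    """
    return f"{base_path}.rank{rank}"


if __name__ == "__main__":
    rank = current_rank()
    # Hypothetical base path; each process resolves its own shard.
    print(f"process {rank} loads {rank_checkpoint_path('swin_moe.pth', rank)}")
```

With this layout, process 0 would load `swin_moe.pth.rank0`, process 1 would load `swin_moe.pth.rank1`, and so on.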