DDP expects same model across all ranks

zhengchen1999 / DAT

PyTorch code for our ICCV 2023 paper "Dual Aggregation Transformer for Image Super-Resolution"

Apache License 2.0


Hi,

I am trying to train the DAT model on my custom dataset and have made all the required changes in the .yml file. I have added the data to the designated directories, but when I run the training command, it stalls for a long time after the following message is displayed: INFO: Network [DAT] is created.

Then I get the following error message and training fails. Kindly let me know how I can fix this issue. Running the pretrained models in inference mode on my custom data works perfectly.

PS: I am training on four RTX 3090 GPUs.
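
A possible way to narrow this down, offered as a diagnostic sketch rather than a confirmed fix: since the error in the title says the model differs between ranks, each rank could fingerprint its own parameter names and shapes and compare them before the model is wrapped in DDP. The helper names below (`param_signature`, `mismatched_ranks`, `gather_signatures`) are hypothetical and are not part of the DAT code base. `gather_signatures` assumes the process group is already initialised and, with the NCCL backend, that each process has already selected its own GPU.

```python
import hashlib


def param_signature(named_shapes):
    """Return (parameter_count, digest) for an iterable of (name, shape) pairs."""
    digest = hashlib.sha256()
    count = 0
    for name, shape in named_shapes:
        # Hash both the name and the shape so renamed or resized tensors show up.
        digest.update(f"{name}:{tuple(shape)};".encode())
        count += 1
    return count, digest.hexdigest()


def mismatched_ranks(signatures):
    """Given one signature per rank, list the ranks that differ from rank 0."""
    return [rank for rank, sig in enumerate(signatures) if sig != signatures[0]]


def gather_signatures(model):
    """Collect every rank's signature (hypothetical helper, needs torch)."""
    # Imported lazily so the pure-Python helpers above work without torch.
    import torch.distributed as dist

    local = param_signature((n, p.shape) for n, p in model.named_parameters())
    gathered = [None] * dist.get_world_size()
    dist.all_gather_object(gathered, local)
    return gathered
```

Calling `mismatched_ranks(gather_signatures(model))` on every rank just before the DDP wrapper is created would return an empty list if all ranks built the same network. A non-empty list would suggest that the ranks are constructing different models. If the list is empty and the error persists, the cause likely lies elsewhere, for example in the long stall described above.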

DDP expects same model across all ranks #29