I'm getting
RuntimeError: DDP expects same model across all ranks, but Rank 0 has 686 params, while rank 1 has inconsistent 0 params.
while trying to train the model. I'm using 8 A100 GPUs as recommended, with a batch size of 64 in stage one. I haven't been able to get the train_hack code past the unet = DDP(unet, device_ids=[local_rank], output_device=local_rank) line.
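
One way to narrow this down might be to log what each rank sees just before the DDP wrap. This is a minimal diagnostic sketch, not a confirmed fix. The helper name param_summary is hypothetical (it is not part of the training code) and it relies only on the standard parameters() and numel() methods of a PyTorch module and its tensors.

```python
def param_summary(model, rank):
    """Return a one-line summary of the parameters visible on this rank."""
    # Materialize the iterator so it can be both counted and summed.
    params = list(model.parameters())
    n_tensors = len(params)
    n_elements = sum(p.numel() for p in params)
    return f"rank {rank}: {n_tensors} parameter tensors, {n_elements} elements"


# Hypothetical usage, placed right before the DDP line in the training script:
# print(param_summary(unet, local_rank), flush=True)
```

If every rank prints the same count, the model itself is probably consistent and the mismatch reported by DDP may come from somewhere else in the process setup; that reading is an assumption, not something the error message confirms.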