CompVis / taming-transformers

Taming Transformers for High-Resolution Image Synthesis
https://arxiv.org/abs/2012.09841
MIT License
5.83k stars 1.15k forks source link

gpu training hangs #235

Open shyakocat opened 10 months ago

shyakocat commented 10 months ago

Related Issue

training with --gpus 0, --gpus 0,1 hangs at initializing ddp ...

I've tried, change ddp to dp, change nccl to gloo, upgrade lightning or torch version... still not work.

(only cpu works)

hllion commented 10 months ago

@shyakocat I encountered the same issue, have you resolved it?

froestiago commented 8 months ago

try running export NCCL_P2P_DISABLE=1 before running the script it worked for me