Open shyakocat opened 10 months ago
Related Issue
training with --gpus 0, --gpus 0,1 hangs at initializing ddp ...
--gpus 0,
--gpus 0,1
I've tried, change ddp to dp, change nccl to gloo, upgrade lightning or torch version... still not work.
ddp
dp
nccl
gloo
(only cpu works)
@shyakocat I encountered the same issue, have you resolved it?
try running export NCCL_P2P_DISABLE=1 before running the script it worked for me
export NCCL_P2P_DISABLE=1
Related Issue
training with
--gpus 0,
--gpus 0,1
hangs at initializing ddp ...I've tried, change
ddp
todp
, changenccl
togloo
, upgrade lightning or torch version... still not work.(only cpu works)