Open billfjj opened 1 year ago
I tried, but it reported an error.
This looks like a torch version problem. Different versions define the launcher argument differently: local_rank vs. local-rank. You could try one of these methods.
Use torchrun to replace python -m torch.distributed.launch
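If you would rather keep the existing launch command, another option is to make the training script accept both spellings. This is only a minimal sketch, not the repo's actual argument parsing: the helper name parse_local_rank is hypothetical, and it assumes the script reads the rank through argparse. torchrun passes the rank in the LOCAL_RANK environment variable instead of a flag, so the sketch falls back to that.

```python
import argparse
import os


def parse_local_rank(argv=None):
    """Return the local rank from either flag spelling, or from LOCAL_RANK.

    Hypothetical helper for illustration; adapt it to the script's own parser.
    """
    parser = argparse.ArgumentParser()
    # Register both spellings on one argument so either launcher style works.
    parser.add_argument(
        "--local_rank",
        "--local-rank",
        dest="local_rank",
        type=int,
        # torchrun does not pass a flag; it exports LOCAL_RANK instead.
        default=int(os.environ.get("LOCAL_RANK", 0)),
    )
    # Ignore the script's other arguments in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    return args.local_rank


if __name__ == "__main__":
    print("local rank:", parse_local_rank())
```

With this in place the same script should start under either python -m torch.distributed.launch or torchrun.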
This is my version of PyTorch: pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7. But it still gives an error. Is it because of Windows? Have you ever trained on Windows?
I fixed this issue by installing a compatible version of PyTorch. I found my CUDA version using nvcc --version, then looked up the matching PyTorch version at https://pytorch.org/get-started/previous-versions/
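To compare against the nvcc --version output, it can help to print what the installed PyTorch build reports. A small sketch, assuming PyTorch may or may not be installed in the current environment; the function name describe_torch_build is made up for this example.

```python
def describe_torch_build():
    """Return version info for the installed PyTorch, or None if it is missing."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        # CUDA version this build was compiled against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # Whether a usable CUDA device and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If cuda_build is None or cuda_available is False, the installed package is probably not the CUDA build you intended to install.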
I solved the problem with this solution: https://github.com/SysCV/sam-hq/issues/100#issuecomment-1903681099
hi, you can modify the nproc_per_node number from 8 to 1 here