SysCV / sam-hq

Segment Anything in High Quality [NeurIPS 2023]
https://arxiv.org/abs/2306.01567
Apache License 2.0

How do I run the training script if I only have one graphics card? #58

Open billfjj opened 11 months ago

lkeab commented 11 months ago

Hi, you can change the nproc_per_node value from 8 to 1 here.

billfjj commented 11 months ago

I tried that, but it reported an error. [image: b62572fc497f1d8b15f3be2dabf5880]

ymq2017 commented 11 months ago

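This looks like a torch version problem. Depending on the version, the launcher passes the local rank argument as either --local_rank or --local-rank, so a script that expects one spelling fails when it receives the other. You could try one of these methods.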

  1. Use a lower version of PyTorch
  2. Use torchrun instead of python -m torch.distributed.launch
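
A third option can be sketched as follows, assuming the training script reads the rank with argparse. The parser below is hypothetical and is not the actual sam-hq code. It registers both spellings of the flag and falls back to the LOCAL_RANK environment variable, which torchrun sets instead of passing a command-line argument.

```python
import argparse
import os


def build_parser():
    # Hypothetical parser for illustration only; the real training script
    # defines its own arguments.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--local_rank", "--local-rank",  # accept both spellings
        dest="local_rank",
        type=int,
        # torchrun exports LOCAL_RANK; default to 0 for a plain single-process run
        default=int(os.environ.get("LOCAL_RANK", 0)),
    )
    return parser


# Either spelling ends up in args.local_rank.
args = build_parser().parse_args(["--local-rank", "0"])
print(f"local rank: {args.local_rank}")
```

With a parser like this, the same script should start under either launcher without depending on which spelling the installed PyTorch version uses.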

billfjj commented 11 months ago

This is my version of PyTorch: pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7. But it still gives an error. [image] Is it because of Windows? Have you ever trained on Windows?

vishakhalall commented 11 months ago

I fixed this issue by installing a compatible version of PyTorch. I checked my CUDA version with nvcc --version, then picked the matching PyTorch build from https://pytorch.org/get-started/previous-versions/
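
A minimal sketch of that check in Python, in case it helps when comparing the installed build against the toolkit. The function name describe_environment is made up for this example. It reports the PyTorch version, the CUDA version PyTorch was built against, and the nvcc output if the CUDA toolkit is on PATH.

```python
import shutil
import subprocess


def describe_environment():
    """Collect the version details needed to choose a matching PyTorch build."""
    info = {}
    try:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # None for CPU-only builds
        info["cuda_available"] = torch.cuda.is_available()
        info["gpu_count"] = torch.cuda.device_count()
    except ImportError:
        info["torch"] = None  # PyTorch is not installed in this environment

    # nvcc is only found when the CUDA toolkit is installed and on PATH.
    nvcc = shutil.which("nvcc")
    if nvcc:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        info["nvcc"] = result.stdout.strip()
    else:
        info["nvcc"] = None
    return info


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If torch_cuda and the release reported by nvcc disagree, or cuda_available is False on a machine with a GPU, that would be consistent with the version mismatch described above.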

halqadasi commented 5 months ago

> This is my version of PyTorch: pytorch==1.13.0 torchvision==0.14.0 torchaudio==0.13.0 pytorch-cuda=11.7. But it still gives an error. [image] Is it because of Windows? Have you ever trained on Windows?

I solved the problem with this solution: https://github.com/SysCV/sam-hq/issues/100#issuecomment-1903681099