Open imaklex5 opened 4 months ago
Hello, I also encountered this problem. Have you solved it?
Set num_gpu in the yml to 1.
And use the following script:
python basicsr/train.py -opt options/Train/train_DAT_light_x2.yml
Thanks to the author for the quick reply, but now I have a new question. I am running on a single 2080Ti and have set both batch_size and num_worker to 1.
The CUDA and PyTorch versions may not match. Rebuild a new Python environment using the following commands:
conda create -n DAT python=3.8
conda activate DAT
pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt  # first delete torch==1.8.0 and torchvision from requirements.txt
python setup.py develop
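After rebuilding the environment, it may help to confirm which CUDA version the installed PyTorch wheel was built for. The snippet below is only a rough sketch: the helper name describe_torch_install is made up for illustration, and it uses nothing beyond torch.__version__, torch.version.cuda and torch.cuda.is_available().

```python
import importlib


def describe_torch_install():
    """Return version info for the installed torch, or None if torch is missing."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    return {
        # e.g. a "+cu111" suffix indicates a CUDA 11.1 build
        "torch": torch.__version__,
        # CUDA version the wheel was compiled against (None for CPU-only builds)
        "built_for_cuda": torch.version.cuda,
        # Whether a usable GPU and driver were found at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

If built_for_cuda is None or cuda_available is False, the wheel or the driver is a likely suspect before anything in the training config.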
I just added these lines to basicsr.utils.dist_util._init_dist_pytorch and used the command the author mentioned earlier, and then it works.
os.environ['RANK'] = '0'
os.environ['WORLD_SIZE'] = '1'
os.environ['MASTER_ADDR'] = '127.0.0.1'
os.environ['MASTER_PORT'] = '1234'
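The same workaround can also be written so that it only fills in variables that are not already set, which avoids overriding values supplied by a real distributed launcher. This is a minimal sketch, not BasicSR code: the names SINGLE_GPU_DEFAULTS and apply_single_gpu_defaults are hypothetical, and the values are the ones from the lines above.

```python
import os

# Assumed defaults for a single-process, single-GPU run
# (same values as in the workaround above).
SINGLE_GPU_DEFAULTS = {
    "RANK": "0",
    "WORLD_SIZE": "1",
    "MASTER_ADDR": "127.0.0.1",
    "MASTER_PORT": "1234",
}


def apply_single_gpu_defaults(env=os.environ):
    """Set each variable only if it is missing, then return the mapping."""
    for key, value in SINGLE_GPU_DEFAULTS.items():
        env.setdefault(key, value)
    return env


if __name__ == "__main__":
    apply_single_gpu_defaults()
    print({key: os.environ[key] for key in SINGLE_GPU_DEFAULTS})
```

Calling apply_single_gpu_defaults() before the process group is initialized should have the same effect as the four hard-coded assignments when none of the variables are set.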
thx very much^_^
Hi!
I did as the reply said to train the DAT-light model on a single 3090 GPU, but got KeyError: 'RANK' as follows. How can I fix this?