Closed kajc10 closed 8 months ago
Hi @kajc10, do you mind sharing your yaml or Docker file for this? I need to train the model on a custom dataset and I'm having a lot of trouble adjusting dependency versions, especially for pytorch-lightning. Thanks!
Hi @kajc10, I faced the same issue. How did you solve it?
Hi @Han1018, I was able to make it work without any Docker file, using only conda envs.
After running
conda env create -f environment.yaml; conda activate taming
I uninstalled PyTorch (torch and torchvision).
Then I installed the 1.8.1+cu111 version:
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
After that I uninstalled pillow and reinstalled it using
pip install pillow==9.5.0
I do not remember very well, but you might come across an error regarding torch._six. If you do, replace
from torch._six import string_classes
with
string_classes = str
(reference)
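For what it's worth, here is a minimal sketch of that replacement written as a guarded import, so the same file keeps working whether or not torch._six exists in the installed PyTorch. It assumes the import sits at module level in whichever file raises the error; I have not checked which files in the repo are affected.

```python
# Compatibility shim: prefer the original import, fall back to plain str
# when torch._six (or torch itself) cannot be imported.
try:
    from torch._six import string_classes
except ImportError:
    string_classes = str

# Downstream code typically uses it in isinstance checks like this one.
def is_string(value):
    return isinstance(value, string_classes)
```

The fallback is the same `string_classes = str` replacement described above; the try/except only avoids having to edit the file again if you switch PyTorch versions.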
Hope this helps 🤗
Hi @froestiago, thank you so much. It works for me and saved me a lot of time!
This somehow resolved my problem with the training process hanging. Thank you!!
I have CUDA version 12.3, so the PyTorch configuration given above will not work for me. I tried to adjust the dependency versions but could not come up with a working setup. Could you help me out?
With the current environment.yaml I get stuck at 'initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1'
EDIT: solved by installing the latest torch, torchvision, and torchaudio, plus pillow==8.4.0
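In case it helps anyone comparing setups, here is a small sketch that prints which versions of the relevant packages ended up installed in the active environment. `installed_versions` is a hypothetical helper written for this comment, not part of the repo, and the package list is just the distributions mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # pip distribution names mentioned in this thread
    names = ["torch", "torchvision", "torchaudio", "pillow", "pytorch-lightning"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it inside the activated conda env (for example after conda activate taming) so it reports that environment's packages rather than the system ones.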