Open parkjh688 opened 5 years ago
I found the cause of the error.
When I print t.device and self.src_device_obj in torch's data_parallel.py file, I get cpu for t.device and cuda:0 for self.src_device_obj.
I guess the model was made for the CPU version. Can you tell me how to change it from CPU to GPU?
The models are made for GPU actually. Which version of torch are you using? Are you sure that you are using GPU? You can check it as in https://stackoverflow.com/a/48152675/6400484
Yes, I am sure; I checked it again using that link.
It seems to be a PyTorch bug. Please check this solution: https://discuss.pytorch.org/t/bug-in-dataparallel-only-works-if-the-dataset-device-is-cuda-0/28634/18
Hi,
@ahmetgunduz I am facing the same issue too. Is there any solution for this?
I checked whether torch can detect the CUDA device (1 GPU in my case), and it looks fine. I am using torch version 1.2.
I am using the following config just to try out the offline test on Jester.
#!/bin/bash
python offline_test.py \
--root_path ~/ \
--video_path /home/karthik/Desktop/Data/Jester/20bn-jester-v1 \
--annotation_path Desktop/Project/Real-time-GesRec/annotation_Jester/jester.json \
--result_path Desktop/Project/Real-time-GesRec/results \
--resume_path Desktop/Project/Real-time-GesRec/pre-trained-models/jester_resnext_101_RGB_32.pth \
--dataset jester \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--batch_size 1 \
--n_classes 27 \
--n_finetune_classes 27 \
--modality RGB \
--n_threads 8 \
--checkpoint 1 \
--train_crop random \
--n_val_samples 1 \
--test_subset val \
--n_epochs 100
@parkjh688 were you able to solve the issue?
Thanks in advance.
@Karthik-Bhaskar just to check, can you please add the --no_cuda parameter as well, to see whether it works on the CPU?
Do I need to give the --no_cuda parameter a value like True or False, or should I just include it without any value, like this?
#!/bin/bash
python offline_test.py \
--root_path ~/ \
--video_path /home/karthik/Desktop/Data/Jester/20bn-jester-v1 \
--annotation_path Desktop/Project/Real-time-GesRec/annotation_Jester/jester.json \
--result_path Desktop/Project/Real-time-GesRec/results \
--resume_path Desktop/Project/Real-time-GesRec/pre-trained-models/jester_resnext_101_RGB_32.pth \
--dataset jester \
--sample_duration 32 \
--learning_rate 0.01 \
--model resnext \
--model_depth 101 \
--batch_size 1 \
--n_classes 27 \
--n_finetune_classes 27 \
--modality RGB \
--n_threads 8 \
--checkpoint 1 \
--train_crop random \
--n_val_samples 1 \
--test_subset val \
--n_epochs 100 \
--no_cuda
I tried executing with the above parameters and ran into
RuntimeError: Error(s) in loading state_dict for ResNeXt
Please tell me if this is the wrong way to add that parameter.
Thanks.
Everything looks fine actually. The way you passed the --no_cuda parameter is right.
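For anyone with the same question, here is a minimal sketch of how a value-less flag like this usually behaves with argparse. It assumes --no_cuda is declared with action='store_true'; the two-option parser below is a hypothetical, cut-down stand-in, not the repo's actual option list.

```python
import argparse

# Hypothetical, cut-down parser; the repo's real options are not reproduced here.
parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', default=1, type=int)
# Assumption: the flag is a store_true switch, so it takes no value.
# Present on the command line -> True, absent -> False.
parser.add_argument('--no_cuda', action='store_true', help='run on CPU only')

with_flag = parser.parse_args(['--batch_size', '1', '--no_cuda'])
without_flag = parser.parse_args(['--batch_size', '1'])
print(with_flag.no_cuda, without_flag.no_cuda)
```

With a store_true flag, writing --no_cuda True would make argparse stop with an "unrecognized arguments" error, so leaving the value off is the right form.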
Honestly, I have no clue about the error. It may be because of the torch version: the repo was last updated for PyTorch 1.0.1.post2, so maybe you can downgrade your PyTorch version and try.
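One thing that may be worth ruling out (a general PyTorch pattern, not confirmed as the cause here): a checkpoint saved from a model wrapped in nn.DataParallel has every key prefixed with module., and loading it into an unwrapped model, as can happen on a CPU-only run, fails with this kind of state_dict error. The sketch below strips that prefix; strip_module_prefix is a hypothetical helper, and the toy dictionary stands in for the real checkpoint contents.

```python
from collections import OrderedDict

def strip_module_prefix(state_dict, prefix='module.'):
    """Return a copy of state_dict with a leading DataParallel prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned

# Toy stand-in for a saved state dict; real values would be tensors.
saved = OrderedDict([('module.conv1.weight', 0), ('module.fc.bias', 1)])
print(list(strip_module_prefix(saved)))
```

If the full error message lists unexpected keys that all start with module., passing the cleaned dictionary to model.load_state_dict would be the thing to try. The key under which the checkpoint stores its weights is not shown in this thread, so check it first.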
I downgraded PyTorch to 1.0.1.post2, but the issue remains the same. Can you please let me know if I need to use a particular version of any package or library? Currently, I am using Python 3.6 and CUDA 10.
Python 3.7.3 and CUDA 10 are the versions I am currently using. See below:
Dear @parkjh688 and @Karthik-Bhaskar, did you find any solution for this?
@ahmetgunduz Unfortunately not yet. Next week I will try to run this code on another machine with a different CUDA and cuDNN version, to check whether this is a CUDA problem or not. My guess is that it is a CUDA version problem.
@parkjh688 That is great! Looking forward to seeing the outcome...
model, parameters = generate_model(opt)
model = model.cuda()
Add the model = model.cuda() line right after the generate_model call, as shown above.
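To see why this helps, here is a minimal sketch that lists which devices hold a model's tensors before and after the move. nn.Linear is a hypothetical stand-in for the model returned by generate_model(opt), and parameter_devices is a helper made up for this illustration. DataParallel expects the parameters and buffers to be on its first device (cuda:0 by default), which matches the cpu vs. cuda:0 mismatch reported at the top of this thread.

```python
import torch
import torch.nn as nn

def parameter_devices(model):
    """Return the set of devices holding the model's parameters and buffers."""
    tensors = list(model.parameters()) + list(model.buffers())
    return {str(t.device) for t in tensors}

model = nn.Linear(4, 2)          # hypothetical stand-in for generate_model(opt)
print(parameter_devices(model))  # a freshly built module lives on the CPU

if torch.cuda.is_available():
    model = model.cuda()             # move the weights to the GPU first
    model = nn.DataParallel(model)   # its forward pass expects them on device_ids[0]
    print(parameter_devices(model))
```

If the first print still shows cpu right before the model is wrapped or called, the .cuda() call has not been applied to the model that is actually used.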
@Karthik-Bhaskar were you able to solve the issue?
RuntimeError: Error(s) in loading state_dict for ResNeXt
Thanks.
The codebase has been updated. Could you please pull the repo and recheck?
@MrXuf No, I could not resolve it. Please recheck with the updated codebase, as @ahmetgunduz said above.
Oh! Thank you for your email. I had the same problem and it bothered me for a few days. I will recheck with the latest code.
Hi.
I have 1 GPU in my computer, but I got this error. I'm new to PyTorch, so I don't know what this error means.