VIPL-SLP / VAC_CSLR

Visual Alignment Constraint for Continuous Sign Language Recognition (ICCV 2021)
https://openaccess.thecvf.com/content/ICCV2021/html/Min_Visual_Alignment_Constraint_for_Continuous_Sign_Language_Recognition_ICCV_2021_paper.html
Apache License 2.0

Question about CPU or GPU error #9

Open chunguangqu opened 2 years ago

chunguangqu commented 2 years ago

I ran your code and got the following error. Where are the parameters moved to the GPU?

Traceback (most recent call last):
  File "main.py", line 218, in <module>
    processor.start()
  File "main.py", line 46, in start
    seq_train(self.data_loader['train'], self.model, self.optimizer,self.device, epoch, self.recoder)
  File "/home/quchunguang/sunday/CSLR/seq_scripts.py", line 24, in seq_train
    loss = model.criterion_calculation(ret_dict, label, label_lgt)
  File "/home/quchunguang/sunday/CSLR/slr_network.py", line 96, in criterion_calculation
    label_lgt.cpu().int()).mean()
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
    self.zero_infinity)
  File "/home/quchunguang/anaconda3/envs/tf/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
    zero_infinity)
RuntimeError: Tensor for argument #2 'targets' is on CPU, but expected it to be on GPU (while checking arguments for ctc_loss_gpu)

ycmin95 commented 2 years ago

I haven't run into this problem before. It appears to be related to the switch between the native and cuDNN CTC loss backends, which came up in an earlier PyTorch discussion.

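Recent versions of PyTorch check tensor devices differently depending on the backend:

For native:
  checkAllSameGPU(c, {log_probs_arg, targets_arg});
For cudnn:
  checkBackend(c, {*log_probs}, Backend::CUDA);
  checkBackend(c, {*targets}, Backend::CPU);

In other words, the native path expects log_probs and targets on the same GPU, while the cuDNN path expects log_probs on the GPU and targets on the CPU.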

You can debug based on your pytorch version and the backend used (torch.backends.cudnn.enabled).
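As a starting point for that debugging, the sketch below reports the PyTorch version and cuDNN setting, then runs nn.CTCLoss with targets placed on the same device as log_probs, which is the arrangement the native check quoted above requires. The function names and tensor sizes are made up for illustration, and moving targets onto the log_probs device is only one possible workaround, not necessarily how this repository is meant to be configured.

```python
def ctc_environment_report():
    """Collect the settings that influence which CTC loss backend PyTorch may use."""
    report = {"torch_version": None, "cuda_available": None, "cudnn_enabled": None}
    try:
        import torch
    except ImportError:
        return report  # torch is not installed in this environment
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    report["cudnn_enabled"] = torch.backends.cudnn.enabled
    return report


def ctc_loss_on_one_device(device="cpu"):
    """Run nn.CTCLoss with log_probs and targets on the same device (illustrative sizes)."""
    import torch

    T, N, C, S = 12, 2, 5, 3  # time steps, batch size, classes (0 = blank), target length
    log_probs = torch.randn(T, N, C, device=device).log_softmax(2)
    # Targets are created on the same device as log_probs; labels avoid the blank index 0.
    targets = torch.randint(1, C, (N, S), dtype=torch.long, device=device)
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), S, dtype=torch.long)
    criterion = torch.nn.CTCLoss(blank=0, zero_infinity=True)
    return criterion(log_probs, targets, input_lengths, target_lengths).item()


if __name__ == "__main__":
    print(ctc_environment_report())
```

Calling ctc_loss_on_one_device("cuda") on a machine with a GPU exercises the same-device case. If the reported torch_version differs from the one you expect, the script is probably running in a different environment than the one you installed into.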

chunguangqu commented 2 years ago

What is your PyTorch version? Thanks.

ycmin95 commented 2 years ago

1.10.0 and 1.10.1 both work for me.

chunguangqu commented 2 years ago

My PyTorch is also 1.10.0. What is your ctcdecode version? When I run main.py, the following error occurs:

(your) (base) quchunguang@ubuntu:~/sunday/CSLR$ python main.py
Loading model
Traceback (most recent call last):
  File "main.py", line 207, in <module>
    processor = Processor(args)
  File "main.py", line 33, in __init__
    self.model, self.optimizer = self.loading()
  File "main.py", line 96, in loading
    loss_weights=self.arg.loss_weights,
  File "/home/quchunguang/sunday/CSLR/slr_network.py", line 38, in __init__
    self.decoder = utils.Decode(gloss_dict, num_classes, 'beam')
  File "/home/quchunguang/sunday/CSLR/utils/decode.py", line 19, in __init__
    self.ctc_decoder = ctcdecode.CTCBeamDecoder(vocab, beam_width=10, blank_id=blank_id,
AttributeError: module 'ctcdecode' has no attribute 'CTCBeamDecoder'

ycmin95 commented 2 years ago

The version is listed in the README. It looks like ctcdecode was not installed successfully.
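One way to check the installation is to see what Python actually imports under the name ctcdecode and whether it exposes CTCBeamDecoder. The helper below is a hypothetical sketch, not part of this repository. One possible cause, which this thread does not confirm, is that a plain directory named ctcdecode (for example a source checkout in the working directory) is imported instead of the built package; in that case the reported location is typically None.

```python
import importlib


def check_module_attr(module_name="ctcdecode", attr="CTCBeamDecoder"):
    """Report whether a module imports, where it was loaded from, and whether it has attr."""
    result = {"importable": False, "location": None, "has_attr": False, "error": None}
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        result["error"] = str(exc)
        return result
    result["importable"] = True
    # A namespace package (a bare directory without __init__.py) has no usable __file__.
    result["location"] = getattr(module, "__file__", None)
    result["has_attr"] = hasattr(module, attr)
    return result


if __name__ == "__main__":
    print(check_module_attr())
```

If importable is True but has_attr is False, reinstalling ctcdecode as described in the README and running the check from a directory that does not contain a ctcdecode folder may help narrow the problem down.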