Alibaba-NLP / CLNER

[ACL-IJCNLP 2021] Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning

Received error "UnboundLocalError: local variable 'loss' referenced before assignment" #5

Closed: victorbai2 closed this issue 3 years ago

victorbai2 commented 3 years ago

When I run "python train.py --config config/wnut17_doc.y", I get the error message below:

2021-09-30 09:31:34,791 Model training base path: "resources/taggers/xlmr-first_10epoch_2batch_2accumulate_0.000005lr_10000lrrate_eng_monolingual_crf_fast_norelearn_sentbatch_sentloss_finetune_nodev_wnut_doc_full_bertscore_eos_ner9"
2021-09-30 09:31:34,791 ----------------------------------------------------------------------------------------------------
2021-09-30 09:31:34,791 Device: cpu
2021-09-30 09:31:34,791 ----------------------------------------------------------------------------------------------------
2021-09-30 09:31:34,791 Embeddings storage mode: none
2021-09-30 09:31:35,288 ----------------------------------------------------------------------------------------------------
2021-09-30 09:31:35,292 Current loss interpolation: 1
['xlm-roberta-base']
Traceback (most recent call last):
  File "/pyt_pro/CLNER/flair/trainers/finetune_trainer.py", line 915, in train
    loss = self.model.forward_loss(student_input)
  File "/pyt_pro/CLNER/flair/models/sequence_tagger_model.py", line 1901, in forward_loss
    features = self.forward(data_points)
  File "/pyt_pro/CLNER/flair/models/sequence_tagger_model.py", line 1027, in forward
    self.mask=self.sequence_mask(torch.tensor(lengths),longest_token_sequence_in_batch).cuda().type_as(features)
  File "/root/miniconda3/envs/env3_pyt1.3.1/lib/python3.6/site-packages/torch/cuda/__init__.py", line 192, in _lazy_init
    _check_driver()
  File "/root/miniconda3/envs/env3_pyt1.3.1/lib/python3.6/site-packages/torch/cuda/__init__.py", line 102, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
2021-09-30 09:31:35,633 [18, 18]
2021-09-30 09:31:35,633 [Sentence: "Is it cricket season ? I've killed about 20 in the laundry room in the past week ." - 18 Tokens, Sentence: "Owner says its cool . Trying to get info on when they'll be back in town now @AandLClothingCo" - 18 Tokens]
> <path>CLNER/flair/trainers/finetune_trainer.py(991)train()
-> if loss != 0:

At this point it stopped in pdb. After I entered "c" to continue, it raised "UnboundLocalError: local variable 'loss' referenced before assignment".

wangxinyu0922 commented 3 years ago

Hi,

See these lines in the log:

AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

It seems that you were not running on a GPU.
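
The driver error above also accounts for the reported `UnboundLocalError`. Judging by the traceback, the exception is raised while `forward_loss` is still running, so `loss` is never assigned before the later `if loss != 0:` check executes. A minimal sketch of that pattern, assuming the trainer catches the exception and carries on (`train_step` and `failing_forward_loss` are hypothetical names, not CLNER code):

```python
def train_step(forward_loss):
    """Mimic a training step that swallows an exception, then reads `loss`."""
    try:
        # If forward_loss raises, the assignment to `loss` never happens.
        loss = forward_loss()
    except AssertionError as err:
        # Hypothetical handler: report the error and keep going.
        print(f"caught: {err}")
    # On the failure path, `loss` is a local name that was never bound.
    return loss != 0


def failing_forward_loss():
    # Stand-in for the `.cuda()` call failing on a machine without a GPU.
    raise AssertionError("Found no NVIDIA driver on your system.")


try:
    train_step(failing_forward_loss)
except UnboundLocalError as err:
    print(f"secondary error: {type(err).__name__}")
```

The `UnboundLocalError` is therefore a symptom; the failure to fix is the first one in the log.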

victorbai2 commented 3 years ago

Yes, I do not have a GPU in my local environment. Is there a config that I can use to train on CPU?

wangxinyu0922 commented 3 years ago

The current version of the code only supports GPU, but you can replace every .cuda() call with .to(flair.device).
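
One way to apply this replacement in bulk is a small text rewrite. The following is a rough sketch, not part of CLNER: `make_device_agnostic` and `CUDA_CALL` are hypothetical names, and the pattern only handles argument-less `.cuda()` calls such as the one in the traceback.

```python
import re

# Matches an argument-less `.cuda()` call, allowing whitespace inside the parentheses.
CUDA_CALL = re.compile(r"\.cuda\(\s*\)")


def make_device_agnostic(source: str) -> str:
    """Replace every argument-less `.cuda()` with `.to(flair.device)`."""
    return CUDA_CALL.sub(".to(flair.device)", source)


# The failing line from the traceback, used here as sample input.
line = (
    "self.mask=self.sequence_mask(torch.tensor(lengths),"
    "longest_token_sequence_in_batch).cuda().type_as(features)"
)
print(make_device_agnostic(line))
```

Calls that pass a device index, such as `.cuda(0)`, are left untouched and need manual review, and any file that gains a `flair.device` reference must also `import flair`. The `Device: cpu` line in the log suggests `flair.device` already resolves to the CPU on this machine.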

victorbai2 commented 3 years ago

Thanks. After replacing a few .cuda() calls, I am able to train on CPU. By the way, did you change the original Flair code? I wonder whether I could install a different version of the Flair library.

wangxinyu0922 commented 3 years ago

Yes, I have changed a lot of the original Flair code.