stanfordnlp / stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
https://stanfordnlp.github.io/stanza/

mwt training always breaks at epoch - 33 #445

Closed: sarves closed this issue 3 years ago

sarves commented 4 years ago

Describe the bug

I tried to train Stanza for Tamil, and MWT training always breaks at the 33rd epoch (I tried several different datasets).

Log:

2020-08-20 09:24:11: step 360/1100 (epoch 33/100), loss = 0.286464 (0.054 sec/batch), lr: 0.001000
Evaluating on dev set...
epoch 33: train_loss = 0.271402, dev_score = 0.9390
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/sarves/Stanza/stanza-train-master/stanza/stanza/models/mwt_expander.py", line 255, in <module>
    main()
  File "/home/sarves/Stanza/stanza-train-master/stanza/stanza/models/mwt_expander.py", line 89, in main
    train(args)
  File "/home/sarves/Stanza/stanza-train-master/stanza/stanza/models/mwt_expander.py", line 182, in train
    trainer.change_lr(current_lr)
AttributeError: 'Trainer' object has no attribute 'change_lr'
Running MWT expander in predict mode
Building an attentional Seq2Seq model...
Using a Bi-LSTM encoder
Using soft attention for LSTM.
Finetune all embeddings.
max_dec_len: 39
Loading data with batch size 50...
3 batches created.
Running the seq2seq model...
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

AngledLuffa commented 4 years ago

Are you comfortable using branches? I think this should be fixed in the fix_mwt branch. Thanks for reporting.

https://github.com/stanfordnlp/stanza/pull/446
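
For anyone hitting the same traceback, here is a minimal sketch of why a missing `change_lr` method can go unnoticed for dozens of epochs. It assumes the training loop only decays the learning rate once a threshold epoch has passed and the dev score stops improving. The names `decay_epoch`, `lr_decay`, `run_schedule`, `FakeOptimizer`, and both trainer classes are hypothetical stand-ins for illustration, not Stanza's actual code, and the fix in the linked pull request may look different.

```python
class FakeOptimizer:
    """Stand-in optimizer that keeps its learning rate in param_groups,
    the same layout torch.optim optimizers use."""

    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]


class BrokenTrainer:
    """Hypothetical trainer that lacks change_lr, as in the traceback."""

    def __init__(self, lr):
        self.optimizer = FakeOptimizer(lr)


class FixedTrainer(BrokenTrainer):
    """Same hypothetical trainer with the missing method added."""

    def change_lr(self, new_lr):
        # Update the learning rate of every parameter group in place.
        for group in self.optimizer.param_groups:
            group["lr"] = new_lr


def run_schedule(trainer, dev_scores, lr=0.001, decay_epoch=30, lr_decay=0.9):
    """Assumed schedule: decay lr when the dev score fails to improve
    after decay_epoch. Returns (epochs where decay fired, final lr)."""
    history = []
    decayed_at = []
    for epoch, score in enumerate(dev_scores, start=1):
        if epoch > decay_epoch and history and score <= history[-1]:
            lr *= lr_decay
            trainer.change_lr(lr)  # first call happens only here
            decayed_at.append(epoch)
        history.append(score)
    return decayed_at, lr


# Dev score improves for 32 epochs, then dips at epoch 33.
scores = [0.5 + 0.01 * i for i in range(32)] + [0.80]

fixed = FixedTrainer(lr=0.001)
decayed_at, final_lr = run_schedule(fixed, scores)

try:
    run_schedule(BrokenTrainer(lr=0.001), scores)
    broken_error = None
except AttributeError as err:
    broken_error = err
```

Under these assumptions the decay branch is never entered before the threshold epoch, so the missing method only surfaces at the first later epoch where the dev score fails to improve. That would be consistent with the crash appearing at the same epoch across datasets, but it is not confirmed by this thread.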

sarves commented 4 years ago

@AngledLuffa Thank you for fixing this. So I can just replace stanza/stanza/models/mwt/trainer.py with the new version from the dev branch and try again. I will do that and report back.