DFKI-NLP / TRE

[AKBC 19] Improving Relation Extraction by Pre-trained Language Representations
https://arxiv.org/abs/1906.03088
MIT License

UnboundLocalError for Semeval training #1

Closed BrambleXu closed 5 years ago

BrambleXu commented 5 years ago

The training command works well for the TACRED dataset, but it fails for the SemEval dataset.

I ran the command below; datasets/semeval_jsonl holds the data.

python relation_extraction.py train \
  --write-model True \
  --masking-mode grammar_and_ner \
  --batch-size 8 \
  --max-epochs 3 \
  --lm-coef 0.5 \
  --learning-rate 5.25e-5 \
  --learning-rate-warmup 0.002 \
  --clf-pdrop 0.1 \
  --attn-pdrop 0.1 \
  --word-pdrop 0.0 \
  --dataset semeval_2010_task8 \
  --data-dir datasets/semeval_jsonl \
  --seed=0 \
  --log-dir ./logs/

It fails with the following error:

Traceback (most recent call last):
  File "relation_extraction.py", line 453, in <module>
    'evaluate': evaluate
  File "/anaconda3/envs/py36/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/anaconda3/envs/py36/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/anaconda3/envs/py36/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "relation_extraction.py", line 302, in train
    dev_file=dev_file)
  File "/Users/smap10/Project/RE-Task/TRE/datasets/semeval_2010_task8.py", line 113, in fetch
    SemEval2010Task8._load_from_jsonl(join(path_to_data, train_file), is_test=False, masking_mode=masking_mode)
  File "/Users/smap10/Project/RE-Task/TRE/datasets/semeval_2010_task8.py", line 88, in _load_from_jsonl
    example = SemEval2010Task8.apply_masking_mode(example, masking_mode)
  File "/Users/smap10/Project/RE-Task/TRE/datasets/semeval_2010_task8.py", line 259, in apply_masking_mode
    first_entity_replace, second_entity_replace = [f'{g}-{n}' for g, n in zip(grammar_type, ner_type)]
UnboundLocalError: local variable 'grammar_type' referenced before assignment

marchbnr commented 5 years ago

Hi, thank you for your interest in our paper and the code.

To get the training running on semeval you have to remove the masking mode, since the dataset does not provide NER tags. The same applies to the evaluation command.
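
For readers who hit the same traceback, here is a minimal sketch of how this kind of UnboundLocalError can arise. It is not the repository's code: the function `apply_masking_mode_sketch` and the `grammar` and `ner` field names are hypothetical. It assumes the tag variables are only assigned when the example carries those fields, which matches the explanation above that the dataset provides no NER tags.

```python
def apply_masking_mode_sketch(example, masking_mode):
    """Hypothetical, simplified stand-in; not the repository's code."""
    # The tag variables are bound only when the example carries those fields.
    if "grammar" in example:
        grammar_type = example["grammar"]
    if "ner" in example:
        ner_type = example["ner"]

    if masking_mode == "grammar_and_ner":
        # With no tags in the example, the names above were never assigned,
        # so reading them here raises UnboundLocalError.
        return [f"{g}-{n}" for g, n in zip(grammar_type, ner_type)]
    return None


# Assumed shape of an example without tags (field names are illustrative).
untagged = {"tokens": ["The", "engine", "powers", "the", "pump"]}

try:
    apply_masking_mode_sketch(untagged, "grammar_and_ner")
except UnboundLocalError as err:
    print(f"UnboundLocalError: {err}")

# Without a masking mode the tag variables are never read, so nothing fails.
print(apply_masking_mode_sketch(untagged, None))
```

In this sketch, dropping the masking mode avoids the failing branch altogether, which is why removing `--masking-mode` from the command sidesteps the error.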

Try running the training on semeval like this:

python relation_extraction.py train \
  --write-model True \
  --batch-size 8 \
  --max-epochs 3 \
  --lm-coef 0.7 \
  --learning-rate 6.25e-5 \
  --learning-rate-warmup 1e-3 \
  --clf-pdrop 0.1 \
  --attn-pdrop 0.15 \
  --dataset semeval_2010_task8 \
  --data-dir datasets/semeval_jsonl \
  --seed=0 \
  --log-dir ./logs/

BrambleXu commented 5 years ago

Thanks for the reply. After removing the masking mode, training works now. However, the command you pasted needs the --seed and --log-dir options separated; otherwise it fails with an error. Please correct it for other users who find this issue.
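
One way to avoid copy-paste problems with multi-line shell commands is to assemble the arguments as a Python list, so that every option is guaranteed to be its own argument. The sketch below is only an illustration: the helper `build_train_command` is hypothetical, the flags and values are copied from the SemEval command in this thread, and `shlex.join` assumes Python 3.8 or later.

```python
import shlex


def build_train_command(data_dir, log_dir, seed=0):
    """Hypothetical helper: returns the SemEval training command as a list."""
    return [
        "python", "relation_extraction.py", "train",
        "--write-model", "True",
        "--batch-size", "8",
        "--max-epochs", "3",
        "--lm-coef", "0.7",
        "--learning-rate", "6.25e-5",
        "--learning-rate-warmup", "1e-3",
        "--clf-pdrop", "0.1",
        "--attn-pdrop", "0.15",
        "--dataset", "semeval_2010_task8",
        "--data-dir", data_dir,
        f"--seed={seed}",      # kept as its own list element
        "--log-dir", log_dir,  # separate from the seed option
    ]


cmd = build_train_command("datasets/semeval_jsonl", "./logs/")

# Print a single-line, shell-safe version of the command for inspection.
print(shlex.join(cmd))

# To actually launch training from the repository root, one could run:
#   import subprocess; subprocess.run(cmd, check=True)
```

Because the list never passes through shell line continuation, stray whitespace after a backslash or two options merged onto one token cannot occur.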