guillaumegenthial / sequence_tagging

Named Entity Recognition (LSTM + CRF) - Tensorflow
https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
Apache License 2.0
1.94k stars, 703 forks

recipe for target 'run' failed #71

Closed shanalikhan closed 4 years ago

shanalikhan commented 5 years ago

I'm getting the following error on make run:

(nltk) shan@shan-HP-ZBook-15:~/Accelirate/sequence_tagging$ make run
python build_data.py
Building vocab...
- done. 20312 tokens
Building vocab...
- done. 400000 tokens
Writing vocab...
- done. 17789 tokens
Writing vocab...
- done. 45 tokens
Writing vocab...
- done. 84 tokens
python train.py
/home/shan/python-virtual-environments/nltk/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Initializing tf session
2018-10-10 15:13:20.715735: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Epoch 1 out of 15
702/702 [==============================] - 453s - train loss: 9.3642
Traceback (most recent call last):
  File "train.py", line 26, in <module>
    main()
  File "train.py", line 23, in main
    model.train(train, dev)
  File "/home/shan/Accelirate/sequence_tagging/model/base_model.py", line 121, in train
    score = self.run_epoch(train, dev, epoch)
  File "/home/shan/Accelirate/sequence_tagging/model/ner_model.py", line 295, in run_epoch
    metrics = self.run_evaluate(dev)
  File "/home/shan/Accelirate/sequence_tagging/model/ner_model.py", line 324, in run_evaluate
    lab_chunks      = set(get_chunks(lab, self.config.vocab_tags))
  File "/home/shan/Accelirate/sequence_tagging/model/data_utils.py", line 398, in get_chunks
    default = tags[NONE]
KeyError: 'O'
makefile:7: recipe for target 'run' failed
make: *** [run] Error 1

liesun1994 commented 5 years ago

I had the same issue as you mentioned above.

maxindian commented 5 years ago

I had the same issue as you mentioned above.

Zqy11 commented 5 years ago

The original CoNLL-2003 format is as follows:

Indian NNP I-NP I-MISC
all-rounder NN I-NP O
Phil NNP I-NP I-PER

The tag is in the fourth column, so you need to modify line 75 of 'model/data_utils.py': change word, tag = ls[0],ls[1] to word, tag = ls[0],ls[3].
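
To illustrate the column fix, here is a minimal sketch of reading the word and the NER tag from one line in the four-column format. The helper name parse_conll_line and its tag_column parameter are hypothetical and not part of the repository; the sketch assumes whitespace-separated columns with the NER tag at index 3.

```python
def parse_conll_line(line, tag_column=3):
    """Split one whitespace-delimited line into (word, tag).

    Hypothetical helper for illustration only. tag_column=3 assumes the
    four-column CoNLL-2003 layout: word, POS tag, chunk tag, NER tag.
    """
    ls = line.strip().split()
    if len(ls) <= tag_column:
        raise ValueError(
            f"expected at least {tag_column + 1} columns, got {len(ls)}: {line!r}"
        )
    return ls[0], ls[tag_column]


# The three sample lines quoted in the comment above.
sample = [
    "Indian NNP I-NP I-MISC",
    "all-rounder NN I-NP O",
    "Phil NNP I-NP I-PER",
]
pairs = [parse_conll_line(line) for line in sample]
```

With tag_column=1 the same helper would return the part-of-speech column instead, which is what happens when ls[1] is read from a four-column file.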

guillaumegenthial commented 5 years ago

This is an out-of-vocabulary issue (some tags are not in the vocab).
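
One way to confirm this before training is to compare the tags that occur in the data against the tag vocabulary. The sketch below is an assumption-laden illustration, not code from the repository: missing_tags is a hypothetical helper, and vocab_tags is assumed to be a dict mapping tag strings to indices, which is how the traceback indexes it with tags[NONE].

```python
NONE = "O"  # the default tag looked up as tags[NONE] in the traceback


def missing_tags(tag_sequences, vocab_tags):
    """Return the tags that occur in the data but not in the vocabulary.

    Hypothetical helper: tag_sequences is an iterable of tag lists (one
    list per sentence) and vocab_tags is assumed to map tag -> index.
    """
    seen = {tag for seq in tag_sequences for tag in seq}
    return seen - set(vocab_tags)


# Illustrative data only: a vocabulary built from the wrong column
# (part-of-speech tags) checked against NER tags from the dev set.
vocab_tags = {"NNP": 0, "NN": 1}
dev_tags = [["I-MISC", "O", "I-PER"]]

absent = missing_tags(dev_tags, vocab_tags)
if NONE not in vocab_tags:
    print(f"default tag {NONE!r} is not in the tag vocabulary")
if absent:
    print("tags missing from the vocabulary:", sorted(absent))
```

If the default tag or any data tag is reported as missing, the vocabulary files need to be rebuilt from the correct column before running train.py again.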