edvisees / opera-TA1-ner-bert

English NER using BERT

multiple opera-TA1-ner-bert errors #1

Closed zaidsheikh closed 4 years ago

zaidsheikh commented 4 years ago
running corenlp for /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/ltf_lang_corrected/eng/JC002YGFT.ltf.xml ...
finished corenlp for /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/ltf_lang_corrected/eng/JC002YGFT.ltf.xml
Traceback (most recent call last):
  File "main.py", line 353, in <module>
    main()
  File "main.py", line 343, in main
    success = run_document(os.path.join(input_dir, file), nlp, ontology, decisions, out_fname=os.path.join(output_dir, file + '.json'))
  File "main.py", line 83, in run_document
    named_ents, ners, feats = extract_ner(sent)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/ner.py", line 407, in extract_ner
    if ner['type'] != 'TTL' and subtype in SUBTYPE_HIERARCHY[ner['type']]:
KeyError: 'LAW'
ERROR: Exception occurred while processing /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/tmp/ltf_lang_corrected/eng/JC002YGFT.ltf.xml, sentence: Patrick:   In 1972, just as they signed the Biological Convention, the Soviet Union expanded their program.
Traceback (most recent call last):
  File "main.py", line 85, in run_document
    named_ents, ners, feats = extract_ner(sent)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/ner.py", line 407, in extract_ner
    if ner['type'] != 'TTL' and subtype in SUBTYPE_HIERARCHY[ner['type']]:
KeyError: 'LAW'
processing /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/en_50/ltf_lang_corrected/eng/JC002YEOJ.ltf.xml
ERROR: Exception occurred while processing /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/en_50/ltf_lang_corrected/eng/KC003AE1L.ltf.xml, sentence: He called the pregnancy "the best news I’ve received in the last 3 1/2 years" — the time he spent behind bars before being released to house arrest last month.
Traceback (most recent call last):
  File "main.py", line 85, in run_document
    named_ents, ners, feats = extract_ner(sent)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/ner.py", line 284, in extract_ner
    ners, ner_probs = mod.pred_ner(sent)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/pytorch-pretrained-bert/examples/run_ner.py", line 109, in pred_ner
    eval_examples, label_list, 300, tokenizer)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/pytorch-pretrained-bert/examples/run_ner.py", line 243, in convert_examples_to_features
    labels_a = tokenize_label(example.text_a, tokens_a, example.label)
  File "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/pytorch-pretrained-bert/examples/run_ner.py", line 213, in tokenize_label
    token_label.append(label[idx])
IndexError: list index out of range
07/24/2020 06:26:33 - INFO - root -   Cleanup...
07/24/2020 06:26:33 - INFO - root -   Killing pid: 6773, cmdline: ['java', '-Xmx8g', '-cp', '/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/stanford-corenlp-full-2017-06-09/*', 'edu.stanford.nlp.pipeline.StanfordCoreNLPServer', '-port', '9000', '-timeout', '120000']
07/24/2020 06:26:33 - INFO - root -   Killing shell pid: 6772, cmdline: ['/bin/sh', '-c', 'java -Xmx8g -cp "/data2/zaid/OPERA/opera-TA1-text-pipeline/xianyang/code_ner_bert/stanford-corenlp-full-2017-06-09/*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 120000']
running corenlp for /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/en_50/ltf_lang_corrected/eng/JC002YEOJ.ltf.xml ...
finished corenlp for /data2/zaid/OPERA_data/tmp/zhisongz_output0715plain/en_50/ltf_lang_corrected/eng/JC002YEOJ.ltf.xml
zaidsheikh commented 4 years ago

Fixed by commit 742c78da7cfef4a0b245b8b20bcf160f3693e540.
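
For readers who hit the same two tracebacks, here is a minimal sketch of the kind of guard that avoids them. It is not the content of the commit above, which is not shown in this thread. The `SUBTYPE_HIERARCHY` contents and the helper names `subtype_is_valid` and `align_labels` are hypothetical. The second helper also assumes the `IndexError` comes from a label list that is shorter than the word list, which the traceback alone does not confirm.

```python
# Hypothetical stand-in for the SUBTYPE_HIERARCHY table in ner.py.
# The real contents are not shown in this issue.
SUBTYPE_HIERARCHY = {
    "PER": {"Politician", "Combatant"},
    "ORG": {"Government", "MilitaryOrganization"},
}


def subtype_is_valid(ner_type, subtype, hierarchy=SUBTYPE_HIERARCHY):
    """Return True only if ner_type is a known type that lists subtype.

    An unknown type such as 'LAW' yields False instead of raising KeyError,
    because dict.get falls back to an empty tuple.
    """
    if ner_type == "TTL":
        return False
    return subtype in hierarchy.get(ner_type, ())


def align_labels(words, labels, default="O"):
    """Return exactly one label per word.

    Extra labels are dropped and missing ones are filled with `default`,
    so a later labels[idx] lookup for any word index stays in range.
    """
    aligned = list(labels[:len(words)])
    aligned.extend([default] * (len(words) - len(aligned)))
    return aligned


if __name__ == "__main__":
    # 'LAW' is absent from the hypothetical table, so no KeyError is raised.
    print(subtype_is_valid("LAW", "Politician"))
    # Three words but only two labels: the last word gets the default label.
    print(align_labels(["3", "1/2", "years"], ["O", "O"]))
```

Padding with a default label hides the mismatch rather than explaining it, so logging the offending sentence when the lengths differ is probably worth adding alongside a guard like this.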