jbjorne / TEES

Turku Event Extraction System

NER tagging for DDI task #26

Closed by ashim95 6 years ago

ashim95 commented 6 years ago

Hi,

First of all, thank you for open-sourcing the code.

I am using your code to get NER tags for drug entities, and I wrote a Python script for this. However, the tagger is using BANNER (entity type = 'Protein'). I also ran it through classify.py and looked at the intermediate files it generated, but the tagger model used is still the default BANNER model. Here is the command I ran:

python classify.py -m ~/.tees/models/DDI13T91-devel -i test_files/ddi_test_file_1.xml -o ddi_test

I could not find anything in the documentation that could help.

Please help. Thanks,

jbjorne commented 6 years ago

By default, the preprocessing done by classify.py includes BANNER, but it can be turned off. The following examples show how to run the DDI13T91-devel model on an example PubMed abstract, with preprocessing but without the NER step.

If you are using the latest TEES development branch, you can omit BANNER by defining the preprocessing pipeline:

python classify.py -i 28850893 -m DDI13T91-devel -o /tmp/DDI13T91-PM28850893 --preprocessorParams LOAD,GENIA_SPLITTER,BLLIP_BIO,STANFORD_CONVERT,SPLIT_NAMES,FIND_HEADS,SAVE

If you are using the TEES master branch version, you can skip BANNER by omitting the NER preprocessing step:

python classify.py -i 28850893 -m DDI13T91-devel -o /tmp/DDI13T91-PM28850893 --omitSteps PREPROCESS=NER
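
For anyone calling classify.py from their own Python script, the two commands above could be assembled along these lines. This is only a sketch: `build_classify_command`, its `branch` argument, and the `python` parameter are hypothetical names introduced here and are not part of TEES. The flags and step names are copied verbatim from the two commands above.

```python
# Preprocessing steps copied from the development-branch command above (no NER step).
DEV_STEPS = ["LOAD", "GENIA_SPLITTER", "BLLIP_BIO", "STANFORD_CONVERT",
             "SPLIT_NAMES", "FIND_HEADS", "SAVE"]


def build_classify_command(input_id, model, output, branch="master", python="python"):
    """Hypothetical helper: build a classify.py argument list that skips BANNER."""
    cmd = [python, "classify.py", "-i", input_id, "-m", model, "-o", output]
    if branch == "development":
        # Development branch: define the preprocessing pipeline explicitly.
        cmd += ["--preprocessorParams", ",".join(DEV_STEPS)]
    elif branch == "master":
        # Master branch: omit the NER preprocessing step.
        cmd += ["--omitSteps", "PREPROCESS=NER"]
    else:
        raise ValueError("unknown branch: %r" % branch)
    return cmd


if __name__ == "__main__":
    # Print the master-branch command as a single shell line.
    print(" ".join(build_classify_command(
        "28850893", "DDI13T91-devel", "/tmp/DDI13T91-PM28850893")))
```

The returned list can be passed to `subprocess.call(cmd, cwd=...)`, with `cwd` pointing at the TEES checkout, assuming classify.py sits at the top of that directory.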

ashim95 commented 6 years ago

It works. Thank you very much!

Cheers