jbjorne / TEES

Turku Event Extraction System

How to train new models? #28

Open insanejokerhaha opened 6 years ago

insanejokerhaha commented 6 years ago

Hi,

The Training page of the wiki gives this command for training models on a new corpus:

python train.py --trainFile MY_TRAINING_CORPUS.xml --develFile MY_DEVELOPMENT_CORPUS.xml --testFile MY_TEST_CORPUS.xml -o OUTDIR -c REMOTE

I assume these input files with the 'xml' extension should be in Interaction XML format. But when I read the GE11 corpus, which is often used as an example in the wiki, I found that the entity and interaction tags are only marked with the attribute event="True", with no further information about what kind of event each one is.

Hence, I am wondering how I can use TEES to train models in a new context. To be more specific, what input format should I use? If I want to extract other, self-defined event types, do I need to build my own event detector, example builder, etc.?

It would be a great help for a beginner like me if you could give me some guidance on this issue. Thank you for your time.

jbjorne commented 5 years ago

An event's type is defined by the 'type' attribute of its trigger node. For more information please see the documentation for the Interaction XML format at https://github.com/jbjorne/TEES/wiki/Interaction-XML .
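
As a rough illustration of that point, the sketch below parses a small Interaction XML fragment with Python's standard library and reads each event's type from the `type` attribute of its trigger entity (the entity marked event="True"). The fragment itself is hypothetical and written only for this example, and the attribute names other than `type` and `event` (`id`, `charOffset`, `given`, `e1`, `e2`) as well as the helper names `event_types` and `event_arguments` are assumptions that should be checked against the Interaction XML wiki page before building a corpus this way.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical Interaction XML fragment, invented for illustration.
# Attribute names other than "type" and "event" are assumptions.
SAMPLE = """<corpus source="MY_CORPUS">
  <document id="MY_CORPUS.d0">
    <sentence id="MY_CORPUS.d0.s0" text="ProtA phosphorylates ProtB.">
      <entity id="MY_CORPUS.d0.s0.e0" text="ProtA" type="Protein" charOffset="0-5" given="True"/>
      <entity id="MY_CORPUS.d0.s0.e1" text="ProtB" type="Protein" charOffset="21-26" given="True"/>
      <entity id="MY_CORPUS.d0.s0.e2" text="phosphorylates" type="Phosphorylation" charOffset="6-20" event="True"/>
      <interaction id="MY_CORPUS.d0.s0.i0" e1="MY_CORPUS.d0.s0.e2" e2="MY_CORPUS.d0.s0.e1" type="Theme" event="True"/>
    </sentence>
  </document>
</corpus>"""


def event_types(xml_text):
    """Count event types by reading the 'type' of each trigger entity."""
    root = ET.fromstring(xml_text)
    return Counter(
        entity.get("type")
        for entity in root.iter("entity")
        if entity.get("event") == "True"
    )


def event_arguments(xml_text):
    """List (event type, argument role, argument text) for each interaction
    whose first endpoint (e1) is a trigger entity."""
    root = ET.fromstring(xml_text)
    entities = {entity.get("id"): entity for entity in root.iter("entity")}
    arguments = []
    for interaction in root.iter("interaction"):
        trigger = entities.get(interaction.get("e1"))
        target = entities.get(interaction.get("e2"))
        if trigger is None or target is None:
            continue  # skip interactions pointing at unknown entities
        if trigger.get("event") != "True":
            continue  # e1 is not an event trigger
        arguments.append(
            (trigger.get("type"), interaction.get("type"), target.get("text"))
        )
    return arguments


if __name__ == "__main__":
    print(event_types(SAMPLE))
    print(event_arguments(SAMPLE))
```

Under these assumptions, a self-defined event type would be expressed by giving the trigger entity a new `type` value in the training, development, and test files, and running the same two functions over a corpus file is a quick way to check which event types it actually contains.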