Closed Yeom closed 9 years ago
Thanks for the question. Yes, training should be run from the first-stage/TRAIN directory with the trainParser script. There's some information about what the script expects as arguments here:
https://github.com/BLLIP/bllip-parser/blob/master/first-stage/TRAIN/README.rst
For training data, it depends on what type of text you'd like to parse. Ideally, the training data would be similar in style to the text you'd like to parse. There are various treebanks available -- some are available for free, others can be licensed from the Linguistic Data Consortium (LDC).
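Before training on a new treebank, it can help to sanity-check the file. The sketch below is not part of bllip-parser; it assumes your training data is in Penn Treebank-style bracketed format (one or more parenthesized trees, with literal parentheses in the text escaped as -LRB-/-RRB-), and the function name `count_complete_trees` is made up for illustration. It only checks that parentheses balance and counts the top-level trees.

```python
def count_complete_trees(text: str) -> int:
    """Count top-level bracketed trees in `text`.

    Assumes Penn Treebank-style bracketing (an assumption, not something
    the parser's README is quoted on here). Raises ValueError if the
    parentheses are unbalanced.
    """
    depth = 0   # current nesting depth
    trees = 0   # number of trees closed at depth 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched closing parenthesis")
            if depth == 0:
                trees += 1
    if depth != 0:
        raise ValueError("unclosed opening parenthesis")
    return trees


# Two copies of a small hand-written example tree.
sample = "(S1 (S (NP (DT The) (NN parser)) (VP (VBZ works)) (. .)))\n" * 2
print(count_complete_trees(sample))  # prints 2
```

If the count doesn't match the number of sentences you expect, or the check raises an error, the file probably needs fixing before you pass it to the training script.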
First, I'm sorry that my English is not very good. I'm a student in South Korea, and I'd like to know how to train the parser on new data. Can I train it on a new corpus? I couldn't find instructions for training. Should I do it in the first-stage/TRAIN directory? I'm very curious.