attardi / deepnl

Deep Learning for Natural Language Processing
GNU General Public License v3.0

Train POS tagger #34

Open dreamk73 opened 8 years ago

From the limited documentation on dl-pos.py, it is not clear exactly what should be in the vocabulary file. Is it the same text as the training data but in sentence format, or is it the same as the vector file? And how large is the training data set typically?

Also, can the code deal with word embeddings trained by other programs, or do you have to create new ones with the provided deepNL code?