nikitakit / self-attentive-parser

High-accuracy NLP parser with models for 11 languages.
https://parser.kitaev.io/
MIT License
861 stars, 153 forks

Avoid downloading of nltk 'punkt' tokenizer #49

Open duichwer opened 4 years ago

duichwer commented 4 years ago

Shouldn't the parameter preserve_line=True be added to the call to nltk.word_tokenize, since the input should only ever be a single sentence?

https://github.com/nikitakit/self-attentive-parser/blob/1ee43a8f93d6f3259c09ea1ff57cf5124ec32efc/benepar/nltk_plugin.py#L89
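
A minimal sketch of what this would look like, assuming the input really is one sentence. The helper name `tokenize_single_sentence` is hypothetical and is not taken from benepar's code. With `preserve_line=True`, `nltk.word_tokenize` skips its sentence-splitting step, which is the part that loads the 'punkt' model, and passes the text straight to the word tokenizer.

```python
import nltk


def tokenize_single_sentence(sentence):
    """Tokenize text that is already known to be a single sentence.

    Hypothetical helper for illustration only. preserve_line=True tells
    word_tokenize not to run sentence splitting first, so the 'punkt'
    data should not need to be downloaded for this call.
    """
    return nltk.word_tokenize(sentence, preserve_line=True)


if __name__ == "__main__":
    print(tokenize_single_sentence("The quick brown fox jumps."))
```

One caveat: if a string did contain several sentences, only the final period would be split off as its own token, so this only makes sense when the single-sentence assumption holds.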