There is nothing wrong with the TreeTagger, but it does increase complexity of installation and maintenance and probably comes with a small speed performance hit.
Use spaCy instead. This will bring up some questions on tokenization and pluggability. For full pluggability we could still keep the option to insert the TreeTagger and/or the tokenizer and therefore add spaCy as an extra alternative (probably the default). I am leaning against having to do any work to keep non-python components around, but it may be good to have a built-in way for people to add custom components (this should be a separate issue though).
There is nothing wrong with the TreeTagger, but it does increase complexity of installation and maintenance and probably comes with a small speed performance hit.
Use spaCy instead. This will bring up some questions on tokenization and pluggability. For full pluggability we could still keep the option to insert the TreeTagger and/or the tokenizer and therefore add spaCy as an extra alternative (probably the default). I am leaning against having to do any work to keep non-python components around, but it may be good to have a built-in way for people to add custom components (this should be a separate issue though).