dkpro / dkpro-core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
https://dkpro.github.io/dkpro-core

[TreeTagger] Problem processing very long tokens #30

Closed reckart closed 9 years ago

reckart commented 9 years ago
When the text contains a very long token, the analysis engine fails.

Original issue reported on code.google.com by richard.eckart on 2011-10-02 15:14:39

reckart commented 9 years ago
- Upgraded to TT4J 1.0.15 (which handles very long tokens)
- Added a log guard in TreeTaggerPosLemma to avoid constructing the TT4J status string unnecessarily
- Added a test case for very long tokens
---
rev 109
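
For readers unfamiliar with the log-guard idea mentioned above, here is a minimal sketch using `java.util.logging`. It is not the actual TreeTaggerPosLemma or TT4J code: the class `LogGuardSketch`, the method `buildStatus`, and the counter `statusBuilds` are hypothetical names introduced only for illustration, and the logging framework used by the real component may differ.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in; NOT the real TreeTaggerPosLemma or TT4J code.
final class LogGuardSketch {
    private static final Logger LOG =
            Logger.getLogger(LogGuardSketch.class.getName());

    // Counts how often the (potentially expensive) status string is built.
    static final AtomicInteger statusBuilds = new AtomicInteger();

    // Hypothetical placeholder for an expensive status description.
    static String buildStatus(int tokensSent) {
        statusBuilds.incrementAndGet();
        return "tokens sent: " + tokensSent;
    }

    // Without a guard: the string is built even if FINE is disabled,
    // because the argument is evaluated before LOG.fine is called.
    static void reportUnguarded(int tokensSent) {
        LOG.fine(buildStatus(tokensSent));
    }

    // With a guard: the string is built only if FINE is enabled.
    static void reportGuarded(int tokensSent) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(buildStatus(tokensSent));
        }
    }

    // Convenience setter so callers can switch the level of this logger.
    static void setLevel(Level level) {
        LOG.setLevel(level);
    }
}
```

With the logger level at `INFO`, `reportGuarded` skips `buildStatus` entirely, whereas `reportUnguarded` still pays for the string construction even though nothing is logged.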

Original issue reported on code.google.com by richard.eckart on 2011-10-02 15:15:07