Open JannikStroetgen opened 7 years ago
I noticed this in my unit tests, too.
But I don't think sentence splitting is HeidelTime's responsibility (unless you use the "NO" tagger).
The Stanford tagger seems to get this right; I don't know about TreeTagger. Since my use case relies on Stanford anyway, I did not bother looking into allowing matches across sentence boundaries.
We actually manipulate the POS output of TreeTagger for a couple of languages to get rid of incorrect sentence boundaries. I will address this soon.
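To make the idea of removing incorrect sentence boundaries concrete, here is a minimal sketch in Python. It is not HeidelTime's or TreeTagger's code: `naive_split`, `merge_false_boundaries`, and the `MONTH_ABBREVIATIONS` list are hypothetical names made up for illustration, and the rule (rejoin when a sentence ends in a month abbreviation and the next one starts with a digit) is only an assumed heuristic.

```python
import re

# Hypothetical list of abbreviations that should not end a sentence
# when a number follows (assumption for this sketch only).
MONTH_ABBREVIATIONS = {
    "jan.", "feb.", "mar.", "apr.", "jun.", "jul.",
    "aug.", "sep.", "sept.", "oct.", "nov.", "dec.",
}


def naive_split(text):
    """Split after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def merge_false_boundaries(sentences):
    """Rejoin a sentence with its successor when the first ends in a
    month abbreviation and the second starts with a digit."""
    merged = []
    for sentence in sentences:
        if merged:
            last_token = merged[-1].split()[-1].lower()
            if last_token in MONTH_ABBREVIATIONS and sentence[:1].isdigit():
                merged[-1] = merged[-1] + " " + sentence
                continue
        merged.append(sentence)
    return merged


if __name__ == "__main__":
    parts = naive_split("JAN. 27, 2017 is a date.")
    print(parts)                          # ['JAN.', '27, 2017 is a date.']
    print(merge_false_boundaries(parts))  # ['JAN. 27, 2017 is a date.']
```

The digit check keeps a genuine boundary such as "It rained in Jan. The roads flooded." intact, since the second sentence does not start with a number.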
"JAN. 27, 2017 is a date." Two sentences are extracted, which prevents JAN. 27, 2017 from being matched as a temporal expression: