reckart / tt4j

TreeTagger for Java
http://reckart.github.io/tt4j/
Apache License 2.0

Try to detect when communication with TreeTagger process loses sync #4

Closed: reckart closed this issue 9 years ago

reckart commented 9 years ago

Original issue 4 created by reckart on 2011-06-01T05:42:00.000Z:

Sometimes an odd character in a token causes the communication stream with TreeTagger to get out of sync: we expect to read back the token we sent, but get a different one. There should be some way for TT4J to detect this and bail out.

reckart commented 9 years ago

Comment #1 originally posted by reckart on 2011-06-01T05:44:17.000Z:

Added a 10-token ring buffer to make it easier to see what caused TT4J to lose sync. Added a method to check whether the token returned from TreeTagger matches the one sent. Note that exact matching does not work because TreeTagger may not recognize some characters.
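
For readers who want a concrete picture of the approach described in this comment, here is a minimal sketch. It is not the actual TT4J code: the class name `SyncGuard`, its methods, and the matching rule are all hypothetical. The rule assumed here is that a returned token matches if it has the same length as the sent token and differs only at positions where the sent character is non-ASCII; the real heuristic in TT4J may well be different.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of TT4J: remembers the last few tokens sent
// and checks each token that comes back against the one that was sent.
final class SyncGuard {
    private final int capacity;
    private final Deque<String> recent = new ArrayDeque<>();

    SyncGuard(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    // Assumed lenient rule: same length, and any differing position must hold
    // a non-ASCII character in the sent token (the tagger may have mangled it).
    static boolean lenientMatch(String sent, String returned) {
        if (sent.length() != returned.length()) {
            return false;
        }
        for (int i = 0; i < sent.length(); i++) {
            char s = sent.charAt(i);
            if (s != returned.charAt(i) && s < 128) {
                return false;
            }
        }
        return true;
    }

    // Records the sent token (evicting the oldest once the buffer is full),
    // then fails with the recent history if the returned token does not match.
    void check(String sent, String returned) {
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(sent);
        if (!lenientMatch(sent, returned)) {
            throw new IllegalStateException("Lost sync: expected [" + sent
                    + "] but got [" + returned + "]; recent tokens: " + recent);
        }
    }

    // Snapshot of the buffered tokens, oldest first.
    List<String> recentTokens() {
        return new ArrayList<>(recent);
    }
}
```

With `new SyncGuard(10)`, a caller would invoke `check(sent, returned)` once per token read back from the tagger process. The exception message then carries up to the last ten tokens sent, which is usually enough context to spot the character that triggered the mismatch.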