tema16 / tt4j

Automatically exported from code.google.com/p/tt4j

Try to detect when communication with the TreeTagger process loses sync #4

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Sometimes an odd character in a token causes the communication stream with 
TreeTagger to get out of sync: we expect to get back the token we sent, but 
receive a different one. TT4J should have some way to detect this and bail 
out.

Original issue reported on code.google.com by richard.eckart on 1 Jun 2011 at 5:42

GoogleCodeExporter commented 9 years ago
Added a 10-token ring buffer to make it easier to see what caused TT4J to lose 
sync.
Added a method to check whether the token returned from TreeTagger matches the 
one sent. Note that exact matching does not work because TreeTagger may not 
recognize some characters.

Original comment by richard.eckart on 1 Jun 2011 at 5:44
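
The two changes described in the comment above might look roughly like the following. This is only a sketch and not the code that went into TT4J: the names `SyncCheckSketch`, `TokenRingBuffer`, `asciiSkeleton`, `lenientMatch`, and `checkInSync` are made up for illustration, and the matching rule (compare only the ASCII letters and digits of both tokens) is an assumed heuristic standing in for whatever lenient comparison TT4J actually uses.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SyncCheckSketch {

    /** Keeps only the most recent tokens (hypothetical helper, not a TT4J class). */
    static final class TokenRingBuffer {
        private final int capacity;
        private final Deque<String> tokens = new ArrayDeque<>();

        TokenRingBuffer(int capacity) {
            if (capacity <= 0) {
                throw new IllegalArgumentException("capacity must be positive");
            }
            this.capacity = capacity;
        }

        void add(String token) {
            if (tokens.size() == capacity) {
                tokens.removeFirst(); // buffer is full: drop the oldest token
            }
            tokens.addLast(token);
        }

        List<String> snapshot() {
            return new ArrayList<>(tokens); // oldest token first
        }
    }

    /** Reduces a token to its ASCII letters and digits, dropping everything else. */
    static String asciiSkeleton(String token) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean asciiAlnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (asciiAlnum) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Assumed heuristic: two tokens match if their ASCII skeletons are equal.
     * Tokens without any ASCII letters or digits always match, so such tokens
     * cannot reveal a loss of sync on their own.
     */
    static boolean lenientMatch(String sent, String returned) {
        return asciiSkeleton(sent).equals(asciiSkeleton(returned));
    }

    /** Records the sent token, then fails with the recent history on a mismatch. */
    static void checkInSync(TokenRingBuffer history, String sent, String returned) {
        history.add(sent);
        if (!lenientMatch(sent, returned)) {
            throw new IllegalStateException("Lost sync: sent [" + sent + "] but got ["
                    + returned + "]; recent tokens: " + history.snapshot());
        }
    }

    public static void main(String[] args) {
        TokenRingBuffer history = new TokenRingBuffer(10);
        // Passes: the unrecognized character is ignored by the comparison.
        checkInSync(history, "na\u00efve", "na?ve");
        try {
            checkInSync(history, "reader", "writer");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Comparing a reduced form of each token tolerates characters that come back altered, at the cost of missing a desync when the differing tokens happen to share the same ASCII letters and digits. Including the ring buffer contents in the error message shows which tokens were sent just before the mismatch.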