Sorry for the delay!
Yes, there is an endless loop when a sentence ends with an abbreviation.
It should be patched in the next version 11.3, which is about to be published.
I'll let you know when we publish it.
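Until the fix is out, a client-side check may help avoid the hang. The following is only a minimal sketch, assuming the loop is triggered when the last word of a sentence is one of the abbreviations reported in this thread. The function name `ends_with_trigger` is hypothetical, and the case-sensitive match is an assumption, not something confirmed about the tagger.

```python
import string

# Abbreviations reported in this thread as triggering the hang.
KNOWN_TRIGGERS = {
    "Cn", "Sex", "Post", "Pro", "Cap", "Ser",
    "Oct", "Ap", "Kal", "Tib", "St", "Pl",
}

def ends_with_trigger(sentence, triggers=KNOWN_TRIGGERS):
    """Return True if the last word of the sentence is a known trigger.

    Trailing punctuation is ignored, so "Kal." counts as "Kal".
    Matching is case-sensitive (an assumption).
    """
    for token in reversed(sentence.split()):
        word = token.strip(string.punctuation)
        if word:
            return word in triggers
    return False
```

Sentences for which this returns True could be set aside and tagged later, once the patched version is available.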
[copy also sent via email]
Hello,
I apologize for writing to you in English, but my French grammar is horrible. Still, I read French fairly well, so feel free to reply in French if you like :)
I have encountered some bugs while using Collatinus for OSX 11.1 full. I have been using the TCP server and the statistical tagger through a custom Python wrapper. Overall, it works very well, and I have tagged ~1.8 million sentences. However, certain words cause the server to go into what looks like an infinite loop (100% CPU utilisation, and it does not respond correctly to further tagging requests).
Based on experimentation, I think the main issues are with abbreviations. Here is the list of words I have discovered so far:
Cn, Sex, Post, Pro, Cap, Ser, Oct, Ap, Kal, Tib, St, Pl
You should be able to replicate the issue by sending a remote tag request with the client, e.g.:
/Applications/Collatinus_11.1.app/Contents/MacOS/Client_C11 -P3 "Ap"
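For testing from Python without the wrapper itself getting stuck, the same request can be sent with a timeout. This is a rough sketch, not the wrapper I actually use: the client path and the `-P3` flag are copied from the command above, while the helper names `run_with_timeout` and `tag` and the 10-second limit are assumptions. It also assumes the client blocks while the server is stuck.

```python
import subprocess

# Path and -P3 flag are taken from the command above; adjust for your install.
CLIENT = "/Applications/Collatinus_11.1.app/Contents/MacOS/Client_C11"

def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd and return its stdout, or None if it does not finish in time."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run before this is raised.
        return None
    return result.stdout

def tag(text, timeout_s=10.0):
    """Send one tagging request through the client; None signals a likely hang."""
    return run_with_timeout([CLIENT, "-P3", text], timeout_s)
```

With this, `tag("Ap")` would be expected to return None when the hang occurs. The timeout only stops the client process; the server itself would still need to be restarted.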
Please let me know if you would like any more information. I'd be happy to test any updated builds on my dataset.
Thank you for the software!