grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Apache License 2.0

APPEND won't happen in some cases #109

Closed: linhkid closed this issue 3 years ago

linhkid commented 3 years ago

Let's say, for example, my corrupted sentence is

"A B C"

I would like to replace "A" with "D E"

But the result is "D B C"; it should be "D E B C".

I checked my .m2 training data and it contains all of the edits, but when I run prediction, the second token "E" is always missing.

I checked during inference (the variable "sugg_token"), and there are no tokens or actions for the word "E".

What could be the reason? I can fix it myself, but it might take a long time. I'd appreciate any help!

skurzhanskyi commented 3 years ago

Yes, that's true. This is a limitation of our architecture: during one iteration, we can predict only one action per token. That's why we remove the other tags during preprocessing. I would suggest splitting this example into two.
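
To make the one-action-per-token constraint concrete, here is a minimal sketch of why "A" → "D E" needs two passes. It assumes the `$KEEP` / `$DELETE` / `$APPEND_x` / `$REPLACE_x` tag scheme described in the GECToR paper. The `apply_tags` helper is hypothetical and written only for illustration; it is not the repository's code.

```python
from typing import List

KEEP = "$KEEP"
DELETE = "$DELETE"
REPLACE_PREFIX = "$REPLACE_"
APPEND_PREFIX = "$APPEND_"


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply exactly one edit tag per source token (hypothetical helper)."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per token")
    out: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == KEEP:
            out.append(token)
        elif tag == DELETE:
            continue  # drop the token
        elif tag.startswith(REPLACE_PREFIX):
            out.append(tag[len(REPLACE_PREFIX):])
        elif tag.startswith(APPEND_PREFIX):
            # keep the token and add the new one right after it
            out.extend([token, tag[len(APPEND_PREFIX):]])
        else:
            raise ValueError(f"unknown tag: {tag}")
    return out


# "A" can carry only one tag per pass, so REPLACE and APPEND cannot both fire.
first_pass = apply_tags(["A", "B", "C"], ["$REPLACE_D", KEEP, KEEP])

# A second pass (or a second training example) supplies the missing append.
second_pass = apply_tags(first_pass, ["$APPEND_E", KEEP, KEEP])
```

Under these assumptions, splitting the example into two means training on the pair "A B C" → "D B C" and the pair "D B C" → "D E B C", so that each pair needs only one tag on the first token.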

linhkid commented 3 years ago

OK, thanks. Or maybe I can just join them with an underscore and then remove it in postprocessing.

skurzhanskyi commented 3 years ago

I'm not sure if this is a good solution. Such a tag will be very rare and won't have enough examples (if I understand the nature of your task).
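
For reference, the underscore workaround discussed above might look like the following sketch. It assumes the model has been trained to emit a merged token such as "D_E" through a single replace tag. The `split_joined_tokens` helper and the `joiner` parameter are hypothetical names, not part of GECToR.

```python
from typing import List


def split_joined_tokens(tokens: List[str], joiner: str = "_") -> List[str]:
    """Split merged tokens such as "D_E" back into "D", "E" (hypothetical helper)."""
    out: List[str] = []
    for token in tokens:
        parts = [part for part in token.split(joiner) if part]
        # A token made only of joiners (e.g. "_") is left unchanged.
        out.extend(parts if parts else [token])
    return out


# Output of a single pass where "A" was tagged with a merged replacement.
restored = split_joined_tokens(["D_E", "B", "C"])
```

Note that this sketch also splits any genuine token that contains the joiner character, and it does not address the concern raised above: each merged tag such as `$REPLACE_D_E` is its own vocabulary entry and may have very few training examples.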