grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Apache License 2.0
891 stars, 216 forks

preprocessing data question #178

Open liwenju0 opened 1 year ago

liwenju0 commented 1 year ago

While reading the source code of preprocess_data.py, I was confused by the following code in the function perfect_align:

[image]

When apply_transformation is called, the code '   '.join(T[j:k]) inserts three spaces between tokens.

However, consider the source code of apply_transformation: [image]

It calls check_equal, check_casetype, check_verb, and check_plural in sequence, and I think the inserted spaces will affect these check functions.
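
To make the concern concrete, here is a minimal sketch of how a three-space join behaves on slices of different lengths in plain Python. The token list T and the function naive_check_equal are hypothetical stand-ins invented for illustration; they are not the repository's data or its check_equal, so this only shows the string behavior, not what the GECToR checks actually do.

```python
# Hypothetical target tokens; not taken from the GECToR data.
T = ["He", "goes", "to", "school"]
SEP = "   "  # the three-space separator mentioned above

empty = SEP.join(T[1:1])   # empty slice: no tokens, so an empty string
single = SEP.join(T[1:2])  # one token: the separator is never inserted
multi = SEP.join(T[1:3])   # two tokens: separated by three spaces


def naive_check_equal(source_token, target_token):
    """Stand-in for an exact-match check; NOT the repository's check_equal."""
    return source_token == target_token


print(repr(empty))    # ''
print(repr(single))   # 'goes'
print(repr(multi))    # 'goes   to'

# A single-token target is unaffected by the separator.
print(naive_check_equal("goes", single))     # True
# A multi-token target contains the extra spaces, so an exact match fails.
print(naive_check_equal("goes to", multi))   # False
# str.split() with no arguments collapses any run of whitespace.
print(multi.split())  # ['goes', 'to']
```

So the separator only shows up when the slice T[j:k] has two or more tokens. Whether that matters for the real checks depends on how each one treats a multi-token target, which is what I am asking about.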