grammarly / gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)
Apache License 2.0

stage 2 training data problem #181

Open Lj4040 opened 1 year ago

Lj4040 commented 1 year ago

Stage 2 training requires data such as FCE and W&I+LOCNESS, which come in M2 format when downloaded. How can I convert an M2 file into two parallel files (source sentences and corrected sentences)? I would really appreciate your help.

gotutiyan commented 1 year ago

I don't know of any de facto standard methods, but I think these scripts can be used reliably.

M2Convertor: https://github.com/Jason3900/M2Convertor

convert_m2_to_parallel.py: https://github.com/kanekomasahiro/gec_tutorial/blob/main/src/convert_m2_to_parallel.py
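For reference, the conversion itself can be sketched in a few lines of Python. This is a minimal illustration, not the code of either script linked above. It assumes the usual M2 layout: an `S` line with the tokenized source sentence, followed by `A` lines whose `|||`-separated fields are the token span, the edit type, the correction, two further fields, and the annotator id, with a blank line between sentences. The function names `m2_to_parallel` and `write_parallel`, and the set of skipped edit types, are assumptions made for this sketch.

```python
# Edit types that carry no usable correction (assumed convention for this sketch).
SKIP_TYPES = {"noop", "UNK", "Um"}


def m2_to_parallel(m2_text, annotator_id=0):
    """Return a list of (source, target) sentence pairs from M2-formatted text."""
    pairs = []
    for block in m2_text.strip().split("\n\n"):
        lines = block.strip().split("\n")
        if not lines[0].startswith("S "):
            continue
        source_tokens = lines[0][2:].split()
        target_tokens = list(source_tokens)
        offset = 0  # how far earlier edits have shifted token positions
        for line in lines[1:]:
            if not line.startswith("A "):
                continue
            fields = line[2:].split("|||")
            start, end = (int(x) for x in fields[0].split())
            edit_type, correction = fields[1], fields[2]
            if edit_type in SKIP_TYPES or int(fields[-1]) != annotator_id:
                continue
            new_tokens = correction.split()  # an empty correction is a deletion
            target_tokens[start + offset:end + offset] = new_tokens
            offset += len(new_tokens) - (end - start)
        pairs.append((" ".join(source_tokens), " ".join(target_tokens)))
    return pairs


def write_parallel(m2_path, source_path, target_path, annotator_id=0):
    """Read an M2 file and write one source and one target sentence per line."""
    with open(m2_path, encoding="utf-8") as f:
        pairs = m2_to_parallel(f.read(), annotator_id)
    with open(source_path, "w", encoding="utf-8") as src, \
            open(target_path, "w", encoding="utf-8") as tgt:
        for source, target in pairs:
            src.write(source + "\n")
            tgt.write(target + "\n")
```

The sketch relies on the edits for each annotator being listed in left-to-right order, and it keeps only one annotator's corrections per sentence. The two output files are line-aligned, so line N of the target file is the corrected version of line N of the source file.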

skurzhanskyi commented 1 year ago

@gotutiyan, thank you for providing the links.

Lj4040 commented 1 year ago

@skurzhanskyi @gotutiyan I really appreciate your help. Thank you for your replies.