Data Format For GEC

Hi, I'm working on GEC for a low-resource language and want to create the datasets myself. I have a few questions and would be grateful if you could answer them. 1) I saw that the training data is in a parallel file format. Should the evaluation data be in M2 format? And is the M2 format used only for evaluation in GEC?

2) If I want to generate feedback on errors or show the location of an error in GEC, is the parallel file format still usable, or should I change the format?
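To make question 2 concrete, here is a minimal sketch of how error locations might be recovered from a parallel source/target pair alone. It assumes whitespace tokenization and uses Python's standard `difflib` for alignment; `locate_edits` is a hypothetical helper name, not part of clang8 or any GEC toolkit, and a real pipeline would likely need a linguistically informed aligner.

```python
from difflib import SequenceMatcher


def locate_edits(source: str, target: str):
    """Return (start, end, operation, replacement) tuples over source tokens.

    Assumption: both sentences are already tokenized and separated by spaces.
    """
    src = source.split()
    tgt = target.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # tokens that were left unchanged carry no error
        # i1:i2 is the erroneous span in the source; tgt[j1:j2] is the fix
        edits.append((i1, i2, tag, " ".join(tgt[j1:j2])))
    return edits


# One parallel pair: erroneous sentence and its correction
print(locate_edits("This are a sentence .", "This is a sentence ."))
```

Each tuple gives token offsets into the source sentence, so the spans could in principle be used to highlight errors or attach feedback without changing the storage format.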

3) What approach do you suggest for training a model for a low-resource language? Could I make use of your model from the paper "A Simple Recipe for Multilingual Grammatical Error Correction"?

google-research-datasets / clang8

Data Format For GEC #13