Hi,
I'm working on GEC for a low-resource language and want to create the datasets myself. I have a few questions and would be thankful if you could answer them.
1) I saw that the training data is in parallel file format. Should the evaluation data be in M2 format? And is the M2 format used only for evaluation in GEC?
2) If I want to generate feedback on errors or show the location of an error, is the parallel file format still usable, or should I switch to a different format?
3) What approach do you suggest for training a model for a low-resource language? Can I use your model from the paper "A Simple Recipe for Multilingual Grammatical Error Correction" to help with this?