Closed shikha10799 closed 4 years ago
Hi @shikha10799,
As mentioned in our paper, we used the One Billion Word corpus to create the artificial GEC corpus.
Also, kindly provide your suggestions on how I can proceed in constructing a dataset with just preposition errors.
If you already have a decent-sized GEC corpus, you could estimate the transition probabilities of the preposition errors (e.g., the probability of Prep-1 being wrongly used in place of Prep-2) and then introduce errors into correct sentences by sampling from these estimates.
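A rough sketch of that idea in Python, in case it helps. This is not the code used for the repository's dataset; the preposition list, the function names `estimate_confusions` and `inject_errors`, and the toy sentence pairs are all made up for illustration. It assumes the GEC corpus is available as token-aligned (errorful, corrected) sentence pairs and only handles one-to-one substitutions.

```python
import random
from collections import Counter, defaultdict

# Small illustrative preposition set (an assumption; extend as needed).
PREPOSITIONS = {"in", "on", "at", "for", "to", "of", "with", "by", "from", "about"}


def estimate_confusions(pairs):
    """Estimate P(written preposition | correct preposition).

    `pairs` is an iterable of (errorful_tokens, corrected_tokens). Only
    pairs of equal length are used, so only substitutions are counted.
    Returns {correct_prep: {written_prep: probability}}.
    """
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        if len(src) != len(tgt):
            continue  # insertions/deletions are skipped in this simple sketch
        for written, correct in zip(src, tgt):
            w, c = written.lower(), correct.lower()
            if w in PREPOSITIONS and c in PREPOSITIONS:
                counts[c][w] += 1
    probs = {}
    for correct, ctr in counts.items():
        total = sum(ctr.values())
        probs[correct] = {w: n / total for w, n in ctr.items()}
    return probs


def inject_errors(tokens, probs, rng=random):
    """Resample each known preposition from its estimated distribution.

    The distribution includes the "no error" case (written == correct),
    so the error rate seen in the GEC corpus is roughly preserved.
    Sampled prepositions are emitted in lowercase.
    """
    out = []
    for tok in tokens:
        dist = probs.get(tok.lower())
        if dist:
            choices, weights = zip(*dist.items())
            out.append(rng.choices(choices, weights=weights, k=1)[0])
        else:
            out.append(tok)
    return out


# Toy parallel data standing in for a real GEC corpus.
pairs = [
    ("I am good in math".split(), "I am good at math".split()),
    ("She arrived at Monday".split(), "She arrived on Monday".split()),
    ("He is good at chess".split(), "He is good at chess".split()),
]
probs = estimate_confusions(pairs)
noisy = inject_errors("See you on Monday".split(), probs, random.Random(0))
print(probs)
print(" ".join(noisy))
```

With a real corpus you would probably want a proper alignment step (so that inserted and deleted prepositions are also counted) and a fuller preposition list, but the overall flow of estimating the confusion table and then sampling from it over clean sentences stays the same.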
Hi, you mentioned in the README that in order to construct errorful sentences we need to specify the path to a correct file along with an output path. My question is: from which source did you extract the correct sentences used to build the erroneous dataset provided in the repository? I also want to construct an erroneous dataset of preposition errors, but first I need a correct dataset for that. Also, kindly provide your suggestions on how I can proceed in constructing a dataset with just preposition errors. Thanks in advance.