kanyun-inc / fairseq-gec

Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data

Question about model parameters, UNK words, loss function and spell error correction system #5

Closed weiqi94 closed 5 years ago

weiqi94 commented 5 years ago

Hi, I find that some parameters mentioned in Section 5.2, such as the learning rate and weight decay, are not consistent with the released train.sh script. So, for all single models shown in Table 5, how did you set these parameters when you ran the Single Model Ablation Study?

Besides, do all single models shown in Table 5 use the edit-weighted MLE? And does "Ignoring UNK words as edits" mean replacing the UNK token with the source word, i.e., using the "--replace-unk" parameter, or just dropping the token?

I notice you use a statistics-based spell error correction system to pre-process the training data. Where can I find this system?

zhawe01 commented 5 years ago

There is a little difference, since I rewrote all the code on top of the latest fairseq codebase. Yes, edit-weighted MLE is used for all the models that I reported. "Ignore UNK words" means ignoring "UNK" as an edit when calculating the precision/recall scores. I use a spell error correction system developed by "YuanFuDao", which is not public, but you can do the spell correction with any other spell correction system.
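To make the edit-weighted MLE idea concrete, here is a minimal sketch. This is not the repo's actual implementation: the weight value `LAMBDA` and the set-membership alignment heuristic are illustrative assumptions, standing in for the paper's edit alignment.

```python
# Sketch of edit-weighted MLE: target tokens that differ from the source
# (i.e., edits) get a larger loss weight than tokens copied unchanged.
LAMBDA = 1.8  # hypothetical extra weight on edited tokens


def edit_weights(source_tokens, target_tokens):
    """Weight 1.0 for target tokens also present in the source,
    LAMBDA for edited tokens. A real system would use a proper
    token alignment instead of this set-membership shortcut."""
    src = set(source_tokens)
    return [1.0 if tok in src else LAMBDA for tok in target_tokens]


def edit_weighted_nll(log_probs, source_tokens, target_tokens):
    """Negative log-likelihood with per-token edit weights:
    sum over t of -w_t * log p(y_t)."""
    weights = edit_weights(source_tokens, target_tokens)
    return -sum(w * lp for w, lp in zip(weights, log_probs))
```

For example, for the pair "I has a dog" → "I have a dog", only "have" is treated as an edit and gets weight `LAMBDA`; the other tokens keep weight 1.0.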

weiqi94 commented 5 years ago

Thanks for your explanation.

So, does this mean we can also use the parameters in your train.sh to replicate the Single Model Ablation Study shown in Table 5?

I am also confused about the description of Figure 2 in the paper. Are the two subfigures swapped? It seems that Figure 2(b) mainly focuses its attention weights on the next word, in the expected order.

zhawe01 commented 5 years ago

You can modify the m2scorer scripts to ignore the "UNK" edits. The parameters in train.sh can't replicate the ablation results.
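One way to approximate the suggested m2scorer change without patching the scorer itself is to pre-filter the M2 gold file, dropping annotation lines whose correction is the UNK token. This is a sketch under assumptions: the UNK spelling (`<unk>`) and the exact M2 field layout may differ in your data.

```python
def drop_unk_edits(m2_lines, unk_token="<unk>"):
    """Remove M2 annotation lines whose correction is the UNK token.
    M2 annotation lines look like:
      A start end|||type|||correction|||REQUIRED|||-NONE-|||annotator
    Source lines ("S ...") and all other lines are kept as-is."""
    kept = []
    for line in m2_lines:
        if line.startswith("A "):
            fields = line[2:].split("|||")
            # fields[2] is the correction string in the standard M2 layout
            if len(fields) > 2 and fields[2].strip() == unk_token:
                continue
        kept.append(line)
    return kept
```

Running m2scorer on the filtered gold file then scores the system without counting UNK corrections as edits.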

Thanks very much. I just noticed that the two pictures are reversed.