kanyun-inc / fairseq-gec

Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data

Questions regarding checkpoints and interactive.sh #9

Closed liusiyi641 closed 5 years ago

liusiyi641 commented 5 years ago

interactive.sh requires a model file at ./out_big_art/models_denoise/checkpoint5.pt. I created that directory and copied in the checkpoint from out/model/checkpoint5.pt, and the script runs and gives seemingly good results. How is the expected ./out/models_denoise/checkpoint5.pt different from the checkpoints in out/model/? That is, can I use checkpoints from out/model to run interactive.sh, and will they give different results? Also, what are the ./out/modelscheckpointema.pt files for? How are they different from the normal checkpoints in the same directory?

Thank you so much for all the amazing work and gracious help!

zhawe01 commented 5 years ago

I hard-coded the checkpoint path. Change it to the checkpoint you want to test (for example, a model that you trained yourself). I only published one model, which is the pre-trained model.
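One way to avoid a hard-coded location is to build the checkpoint path from a directory and an epoch number. The following is only a minimal sketch: `resolve_checkpoint` is a hypothetical helper, not part of this repository, and the `checkpoint<N>.pt` / `checkpointema<N>.pt` naming is assumed from the file names quoted in this thread.

```python
from pathlib import Path


def resolve_checkpoint(model_dir, epoch, use_ema=False):
    """Return the path of the checkpoint for `epoch` inside `model_dir`.

    Hypothetical helper: the file naming is an assumption taken from the
    names mentioned in this thread (checkpoint5.pt, checkpointema5.pt).
    """
    prefix = "checkpointema" if use_ema else "checkpoint"
    path = Path(model_dir) / f"{prefix}{epoch}.pt"
    if not path.is_file():
        # Fail early with a clear message instead of letting the
        # downstream script crash on a missing file.
        raise FileNotFoundError(f"no checkpoint at {path}")
    return path
```

The returned path could then be passed to whatever script loads the model, in place of the fixed path.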

liusiyi641 commented 5 years ago

Yes. These models are checkpoints I got after training from the pre-trained model, so I think I'm doing the right thing. Thank you. May I still ask what the difference is between checkpoint5.pt and checkpointema5.pt?

zhawe01 commented 5 years ago

We save two models for each checkpoint, one with EMA and one without. The model with EMA should perform better than the one without.
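For readers unfamiliar with the term, and assuming EMA here means an exponential moving average of the model weights, the idea can be sketched in plain Python. This is an illustration only, not the repository's training code: `ema_update`, the `decay` value, and the toy single-weight "model" are all made up for the example.

```python
def ema_update(shadow, params, decay=0.999):
    """Blend the current weights into the running (EMA) copy in place."""
    for name, value in params.items():
        shadow[name] = decay * shadow[name] + (1.0 - decay) * value
    return shadow


# Toy "model" with a single weight; real models hold many tensors.
params = {"w": 0.0}
shadow = dict(params)  # the EMA copy starts equal to the initial weights

for step in range(1, 6):
    params["w"] = float(step)  # stand-in for one optimizer update
    ema_update(shadow, params, decay=0.5)

# `params` holds the latest raw weights (what a plain checkpoint would store);
# `shadow` holds the smoothed weights (what an EMA checkpoint would store).
```

Because the averaged copy changes more slowly than the raw weights, it tends to smooth out noise from individual updates, which is the usual reason an EMA checkpoint is expected to score a little better.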