awasthiabhijeet / PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models for Local Sequence Transduction": www.aclweb.org/anthology/D19-1435.pdf (EMNLP-IJCNLP 2019)
MIT License

Releasing weights #2

Closed: skurzhanskyi closed this issue 4 years ago

skurzhanskyi commented 4 years ago

Hi authors, great work with the paper! I'd like to know whether you're going to release the weights, not of the single best model, but of the one mentioned in example_scripts/README.md with an F_{0.5} score close to 26.6. Thank you

awasthiabhijeet commented 4 years ago

Hi, end_to_end.sh trains the model on 1000 sentences just for demonstration purposes. I believe a checkpoint obtained from training on only 1000 sentences is of no practical use (in comparison to the best checkpoint), so I did not retain it.

skurzhanskyi commented 4 years ago

There is no practical use for correction quality, but it would allow evaluating the speed of your model without training on TPUs (which are not widely available).

awasthiabhijeet commented 4 years ago

For evaluating inference time speedups, the pre-trained checkpoint can be utilized.

skurzhanskyi commented 4 years ago

Yes, but in that case a lot of time is spent applying the edits the model produces (because they are random).
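
As a minimal, hypothetical sketch of what I mean, the edit-prediction step and the edit-application step could be timed separately, so that slow post-processing of spurious edits is not mistaken for slow decoding. `predict_edits` and `apply_edits` below are placeholder callables standing in for whatever prediction and edit-application entry points the repo actually uses, not the real PIE API:

```python
import time
from typing import Callable, List, Tuple

def time_gec_inference(
    predict_edits: Callable[[List[str]], list],   # placeholder: model's edit prediction
    apply_edits: Callable[[List[str], list], List[str]],  # placeholder: applying edits to text
    sentences: List[str],
) -> Tuple[float, float, List[str]]:
    """Time edit prediction and edit application separately.

    Both callables are hypothetical stand-ins, not the actual PIE entry points.
    """
    # Time the model's forward pass / edit prediction.
    start = time.perf_counter()
    edits = predict_edits(sentences)
    predict_seconds = time.perf_counter() - start

    # Time the post-processing that applies the predicted edits to the input.
    start = time.perf_counter()
    corrected = apply_edits(sentences, edits)
    apply_seconds = time.perf_counter() - start

    return predict_seconds, apply_seconds, corrected
```

With the pre-trained checkpoint plugged into `predict_edits`, the two numbers would show whether decoding or edit application dominates the wall-clock time.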

awasthiabhijeet commented 4 years ago

Predictions from a pretrained checkpoint of the GEC model should not be random.