grammatical / pretraining-bea2019

Models, system configurations, and outputs of our winning GEC systems in the BEA 2019 shared task, described in R. Grundkiewicz, M. Junczys-Dowmunt, K. Heafield: Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data, BEA 2019.
MIT License

When do you plan to release training scripts? #3

Closed BogdanDidenko closed 3 years ago

BogdanDidenko commented 5 years ago

We would like to reproduce your system's results. Do you have an ETA for when the training scripts and the synthetic data generation scripts will be released?

BogdanDidenko commented 4 years ago

Do you have any updates?

sai-prasanna commented 4 years ago

@snukky @emjotde

Lavine24 commented 4 years ago

Any update on the training scripts? @sai-prasanna @emjotde @snukky

shizhediao commented 3 years ago

Any plans to release the training scripts? @sai-prasanna @emjotde @snukky Thanks!

aalayrot commented 3 years ago

@sai-prasanna @emjotde @snukky I appreciate your work here. I'm a student at Stanford and would like to expand on your work for a paper. It would be helpful if you could release the training scripts, or send them privately. Do you plan on doing that?

Thanks!

sai-prasanna commented 3 years ago

Hi, I am not involved in this project. I had commented, like the rest of you, hoping to get the scripts.

snukky commented 3 years ago

Hi, I've been sharing a tarball with the original scripts and data privately via e-mail due to licensing, so please drop me an email.

The training scripts are very similar to those published at https://github.com/grammatical/neural-naacl2018/tree/master/training, and basically the same as those at https://github.com/grammatical/magec-wnut2019/tree/master/training/en.

The synthetic part of the data is available from http://data.statmt.org/romang/gec-bea19/synthetic/, and a better, newer version of the data (with noise applied before splitting into subwords) can be found here: http://data.statmt.org/romang/gec-wnut19/data.en.tgz
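
For example, assuming standard `wget` and `tar` are available, fetching and unpacking the newer tarball looks like this (the exact file layout inside the archive isn't documented in this thread, so inspect it after extraction):

```sh
# Download the newer synthetic data (noise applied before subword splitting)
wget http://data.statmt.org/romang/gec-wnut19/data.en.tgz

# Unpack into the current directory; list the extracted files afterwards
# to see the actual layout of the archive
tar -xzvf data.en.tgz
```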

emjotde commented 3 years ago

@snukky should we state this very clearly in the README? As it stands, the repository seems to be creating expectations that won't ever be met.