bash file as example - GitHub Issues

ruiyeNLP commented 1 year ago

Hi all, thanks again for your excellent work. Could you please add a bash file or an example showing how to train GECToR? I went through your paper but remain unsure how to train the model. The main thing I am unclear about is the three training stages, which involve many datasets in different formats. It would help a lot if you could add an example or a bash file covering one whole training pipeline. Looking forward to your reply.

mughal41 commented 1 year ago

Hi @ruiyeNLP, please have a look at the project's README if you want to reproduce the results stated in their paper.

To start training the model, follow these steps:

1. Gather the data for the first stage. The sources are listed in the Dataset section of the project's README.
2. Convert the M2-format file into two parallel files. I guess this will generate something like corr_sent.txt and incorr_sent.txt.
3. Once you have the two parallel files for both the train and dev sets, run the project's pre-processing script described in the README to generate train_set.txt and dev_set.txt.
4. Load these files into train.py and train your model following these params
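
For the M2 conversion step, here is a minimal sketch of what such a converter could look like. It is not the project's own script. It assumes the common M2 layout, where each block starts with an `S <tokens>` line followed by `A start end|||type|||correction|||...|||annotator` lines, and it keeps only the edits of one annotator. The names `apply_edits`, `m2_to_parallel`, and `SAMPLE_M2` are hypothetical and exist only for illustration.

```python
# Hypothetical helper, not part of the GECToR repository.
# Assumes M2 blocks are separated by blank lines and that edit
# offsets are token indices into the source sentence.

SAMPLE_M2 = """S This are a sentence .
A 1 2|||R:VERB:SVA|||is|||REQUIRED|||-NONE-|||0

S I like like cats
A 2 3|||U:VERB||||||REQUIRED|||-NONE-|||0
A 4 4|||M:PUNCT|||.|||REQUIRED|||-NONE-|||0

S Fine .
A -1 -1|||noop|||-NONE-|||REQUIRED|||-NONE-|||0
"""


def apply_edits(source_tokens, edits):
    """Apply (start, end, correction) token-level edits to a token list."""
    corrected = list(source_tokens)
    # Work from right to left so earlier offsets stay valid.
    for start, end, correction in sorted(edits, reverse=True):
        corrected[start:end] = correction.split()
    return corrected


def m2_to_parallel(m2_text, annotator_id=0):
    """Return (incorrect_sentences, corrected_sentences) from M2 text."""
    incorrect, corrected = [], []
    for block in m2_text.strip().split("\n\n"):
        lines = block.strip().split("\n")
        if not lines[0].startswith("S "):
            continue
        tokens = lines[0][2:].split()
        edits = []
        for line in lines[1:]:
            if not line.startswith("A "):
                continue
            fields = line[2:].split("|||")
            start, end = map(int, fields[0].split())
            error_type, correction = fields[1], fields[2]
            annotator = int(fields[-1])
            # Skip "no edit" markers and edits from other annotators.
            if error_type == "noop" or start < 0 or annotator != annotator_id:
                continue
            edits.append((start, end, correction))
        incorrect.append(" ".join(tokens))
        corrected.append(" ".join(apply_edits(tokens, edits)))
    return incorrect, corrected


if __name__ == "__main__":
    src, tgt = m2_to_parallel(SAMPLE_M2)
    for s, t in zip(src, tgt):
        print(s, "->", t)
```

Writing the two returned lists line by line to incorr_sent.txt and corr_sent.txt would give the parallel files that the pre-processing step expects as input. Whether your M2 files follow exactly this layout is something to check before relying on it.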

skurzhanskyi commented 1 year ago

Thank you, @mughal41. You are absolutely right.

Lj4040 commented 1 year ago

@mughal41 How do I convert this M2-format file into two parallel files? Was it generated from the error.py file? I would appreciate your help.

grammarly / gector

bash file as example #176