cofe-ai / fast-gector

Apache License 2.0
How to obtain the Data #15

Open pribadihcr opened 1 year ago

pribadihcr commented 1 year ago

Hi, how do I get the data mentioned in the prepare data script?

SUBSET="train-stage2"
SOURCE="../gec_private_train_data/${SUBSET}.src"
TARGET="../gec_private_train_data/${SUBSET}.trg"
OUTPUT="../gec_private_train_data/${SUBSET}.edits"

Jason3900 commented 12 months ago

You need to change the SOURCE and TARGET paths to point to your own dataset. The .src and .trg files contain lines of text: the original sentences and their corrected versions, respectively. Running the script produces a file of edits, and that file is what the model is trained on.
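For anyone building these files from their own corpus, here is a minimal sketch of writing a .src/.trg pair in Python. It assumes one sentence per line, with line N of the .src file paired with line N of the .trg file. The helper names `write_parallel_files` and `count_lines` are hypothetical and not part of fast-gector, and any tokenization the prepare data script may expect is not handled here.

```python
from pathlib import Path


def write_parallel_files(pairs, out_dir, subset="train-stage2"):
    """Write (original, corrected) sentence pairs to <subset>.src and <subset>.trg.

    Hypothetical helper, not part of fast-gector.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    src_path = out_dir / f"{subset}.src"
    trg_path = out_dir / f"{subset}.trg"
    with src_path.open("w", encoding="utf-8") as src, \
            trg_path.open("w", encoding="utf-8") as trg:
        for original, corrected in pairs:
            # Collapse all whitespace (including embedded newlines) so each
            # sentence occupies exactly one line and the two files stay paired.
            src.write(" ".join(original.split()) + "\n")
            trg.write(" ".join(corrected.split()) + "\n")
    return src_path, trg_path


def count_lines(path):
    """Return the number of lines in a text file."""
    with Path(path).open(encoding="utf-8") as f:
        return sum(1 for _ in f)
```

Calling `write_parallel_files(pairs, "../gec_private_train_data")` with a list of (original, corrected) tuples would produce train-stage2.src and train-stage2.trg at the paths used in the script's SOURCE and TARGET variables. Comparing `count_lines` on both files is a quick sanity check that they have the same number of lines before generating the edits file.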