I noticed that the preprocessing scripts in the OPUS-MT-Train library do not match the ones published with the models in the OPUS models repository. I suspect the preprocessing scripts in the training library (scripts/) are outdated, because I ran into issues when I used them to train my own model. I swapped in the attached script (one I pulled from a model in the repo) and things went smoothly. I just want to confirm that replacing it was the right thing to do. This is for building an SPM model, so I replaced scripts/preprocess-spm.sh with the attached file.
preprocess.sh.txt