Closed by feralvam 4 years ago
I added the scripts I used for preprocessing into the repository (most were adapted from Moses, I think).
Thanks
Hi, what was used to truecase the files? Or are those the original sentences, before being pre-processed with the scripts?
I believe they are the original sentences.
Hello,
Could you provide details on the pre-processing applied to the dataset? For example, which tokenizer was used (with which options)? Thank you.