pum-purum-pum-pum closed 6 years ago
Hi! For the training set, I replaced all singletons with <unk>. The dev/test sets are the same as the official release :)
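For anyone reading later, here is a minimal sketch of what replacing singletons with `<unk>` could look like. It assumes whitespace tokenization and that "singleton" means a token occurring exactly once in the training set; the function name `replace_singletons` is made up for illustration, and this is not necessarily how the actual preprocessing scripts do it.

```python
from collections import Counter


def replace_singletons(sentences, unk="<unk>"):
    """Replace tokens that occur exactly once in the corpus with `unk`.

    Assumes each sentence is a whitespace-tokenized string.
    """
    # Count every token over the whole training corpus.
    counts = Counter(tok for sent in sentences for tok in sent.split())
    # Keep tokens seen more than once; map the rest to the unk symbol.
    return [
        " ".join(tok if counts[tok] > 1 else unk for tok in sent.split())
        for sent in sentences
    ]


train = ["the cat sat", "the dog sat", "a bird flew"]
cleaned = replace_singletons(train)
print(cleaned)
```

Only the training set would be passed through this; the dev/test sets stay untouched, as described above.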
Thank you again. And did you just concatenate all these XML files for dev/test? I'm looking at this one: https://wit3.fbk.eu/mt.php?release=2014-01
I used the data preprocessing scripts at https://github.com/harvardnlp/BSO/tree/master/data_prep. Hope this helps!
I tried it and it just works :) Thank you so much!
Can you please tell me which preprocessing you used? I found that the original IWSLT release consists of several XML files. Thank you!