Question about the dataset

First of all, thanks for releasing the implementation. I'm trying to reproduce the results in your paper and have some questions about the data sets you used.

About the IWSLT data set:

For language pairs such as Ar-En, which release did you actually use: 2014 or 2016?
For the validation set, you mention using the official dev and test sets. Does that mean only dev2010 is used for validation, or is it merged with the test201X sets of previous years?
For Zh-En, why is the Moses tokenizer used for Chinese?

About the WMT data set:

The official WMT 16 data for Fi-En and Ro-En contain only 2M and 800K sentence pairs respectively, but your experiments report 2.5M and 2.2M. What accounts for the difference?

Thank you, and I look forward to hearing from you.

RayeRen / multilingual-kd-pytorch
