Hi,
Firstly, thanks for releasing the implementation. I ran into a couple of issues while trying things out.
I tried training the teacher on the single language pair "en-de" instead of all languages, using WMT17 data instead of IWSLT. I started training on the 3.9 million sentences, and after 9 epochs of training I tried to save the output, for which I got:
As you can see, the outputs haven't been saved. To check whether I had made a mistake in changing the scripts or whether it was a limitation with the bigger dataset, I used a smaller subset of the WMT17 dataset containing 0.9 million sentences, for which I got the log below.
This time the top_k probabilities and indices were saved, and the log shows the same Saved expert@en_de message, which confirms that for a larger dataset like the 3.9 million sentences the output is somehow not saved.
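(For context, by "top_k probabilities and indices" I mean the usual per-token teacher outputs, roughly as in the sketch below; this is just my understanding, and the repo's actual saving logic may differ.)

```python
import torch

def extract_topk(teacher_logits: torch.Tensor, k: int = 8):
    """Illustrative only: per-token top-k teacher probabilities and
    their vocabulary indices, as I understand the expert outputs.
    teacher_logits: (batch, seq_len, vocab_size)
    """
    probs = torch.softmax(teacher_logits, dim=-1)
    topk_probs, topk_indices = probs.topk(k, dim=-1)
    return topk_probs, topk_indices

# e.g. dumped to disk per shard so the student can load them later
# torch.save({"probs": topk_probs, "idx": topk_indices}, "expert_en_de.pt")
```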
Also, there is no option to provide --srcdict and --tgtdict in preprocess.py, as is done here, so a new joined dictionary is created every time I want to use a subset of the data for training.
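For reference, this is the kind of invocation I mean, assuming preprocess.py followed fairseq's interface (the paths below are illustrative, and these flags are not currently accepted by this repo's script):

```bash
# Reuse the dictionaries built on the full data when binarizing a subset,
# instead of regenerating a new joined dictionary each time.
python preprocess.py --source-lang en --target-lang de \
    --trainpref subset/train --validpref subset/valid --testpref subset/test \
    --srcdict data-bin/wmt17_full/dict.en.txt \
    --tgtdict data-bin/wmt17_full/dict.de.txt \
    --destdir data-bin/wmt17_subset
```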
Thank you once again.