nlpyang / PreSumm

Code for the EMNLP 2019 paper "Text Summarization with Pretrained Encoders"
MIT License

Using a different BERT for training - English predictions #193

Open kmilacic opened 4 years ago

kmilacic commented 4 years ago

I have a dataset of comments in a foreign language, but I don't have reference summaries for them, so I cannot use them for training. For my project I wanted to replace the English BERT model in your code with multilingual BERT, and also with a BERT pretrained on both my language and English, expecting better results. However, when I run my dataset through your released abstractive model (which uses English BERT), the predictions at least contain words in my language; when I use a model trained with a different BERT, the predictions contain only English words. Do you have an idea what the problem could be, or am I missing something important?
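
For reference, below is a minimal sketch of the kind of swap described above: loading a multilingual BERT checkpoint as the encoder and pairing it with the matching tokenizer. This is not PreSumm's actual code; it uses the Hugging Face transformers API, and the checkpoint name and wrapper class are illustrative assumptions. The key point it illustrates is that the tokenizer used at preprocessing time and the encoder used at training/inference time should come from the same checkpoint.

```python
# Minimal sketch (not PreSumm's actual code): swapping the English BERT
# encoder and tokenizer for a multilingual checkpoint via Hugging Face
# transformers. Checkpoint name and wrapper class are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

CHECKPOINT = "bert-base-multilingual-cased"  # assumed multilingual checkpoint


class MultilingualBertEncoder(nn.Module):
    """Hypothetical drop-in encoder wrapping a multilingual BERT."""

    def __init__(self, checkpoint: str = CHECKPOINT):
        super().__init__()
        self.model = BertModel.from_pretrained(checkpoint)

    def forward(self, input_ids, attention_mask, token_type_ids=None):
        # Return the last hidden states, one vector per input token.
        outputs = self.model(input_ids=input_ids,
                             attention_mask=attention_mask,
                             token_type_ids=token_type_ids)
        return outputs.last_hidden_state


if __name__ == "__main__":
    # The tokenizer must come from the SAME checkpoint as the encoder;
    # otherwise the token ids produced during preprocessing will not line
    # up with the model's embedding table and output vocabulary.
    tokenizer = BertTokenizer.from_pretrained(CHECKPOINT)
    encoder = MultilingualBertEncoder()

    batch = tokenizer(["Ein kurzer Beispielsatz."],
                      return_tensors="pt", padding=True)
    with torch.no_grad():
        hidden = encoder(batch["input_ids"], batch["attention_mask"])
    print(hidden.shape)  # (batch, seq_len, hidden_size)
```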