cl-tohoku / bert-japanese

BERT models for Japanese text.
Apache License 2.0

The results seem different from Hugging Face... #26

Open leoxu1007 opened 3 years ago

leoxu1007 commented 3 years ago

Thank you for the great model. I tried this model on our lab's experiment machine, but the results seem different from those produced by the Hugging Face web demo.

I used this model: https://huggingface.co/cl-tohoku/bert-base-japanese-whole-word-masking?text=%E3%83%AA%E3%83%B3%E3%82%B4%5BMASK%5D%E9%A3%9F%E3%81%B9%E3%82%8B%E3%80%82

And I wrote: リンゴ[MASK]食べる。

The model on the web gives:

リンゴ を 食べる 。 0.870
リンゴ も 食べる 。 0.108
リンゴ は 食べる 。 0.009
リンゴ のみ 食べる 。 0.005
リンゴ とともに 食べる 。 0.001

Then I downloaded the model and ran it locally. The tokenized input is ['リンゴ', '[MASK]', '食べる', '。'], and loading the checkpoint prints this warning:

Some weights of the model checkpoint at /home/Xu_Zhenyu/JapaneseBERTModel/cl-tohoku/bert-base-japanese-whole-word-masking/ were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
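For reference, here is a minimal sketch of how the top predictions and their probabilities can be computed from the local checkpoint. The path is the one from this report; it assumes PyTorch and a transformers install with the Japanese tokenizer dependencies (fugashi, ipadic) available, so details may differ from the script actually used.

```python
import torch
from transformers import BertJapaneseTokenizer, BertForMaskedLM

# Local checkpoint path taken from this report (assumption: it contains both
# the tokenizer files and the model weights).
model_path = "/home/Xu_Zhenyu/JapaneseBERTModel/cl-tohoku/bert-base-japanese-whole-word-masking/"

tokenizer = BertJapaneseTokenizer.from_pretrained(model_path)
model = BertForMaskedLM.from_pretrained(model_path)
model.eval()

text = "リンゴ[MASK]食べる。"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Position of the [MASK] token in the input sequence
mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

# Softmax over the vocabulary at the masked position gives probabilities
probs = logits[0, mask_index].softmax(dim=-1)
top = probs.topk(5)
for score, token_id in zip(top.values[0], top.indices[0]):
    print(tokenizer.decode([int(token_id)]), float(score))
```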

The results are different: [を も は のみ とともに] on the web versus [を 、 も 野菜 で] locally. Why?

I also have another question: the web demo shows scores such as 0.870, 0.108, 0.009, and so on. How can I get those numbers locally?
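A simpler route, assuming a recent transformers version where the fill-mask pipeline accepts top_k, is to let the pipeline compute the scores directly; these are the same kind of probabilities the web widget displays:

```python
from transformers import pipeline

# Fill-mask pipeline for the model linked above (downloaded from the Hub or
# read from the local cache).
fill_mask = pipeline(
    "fill-mask",
    model="cl-tohoku/bert-base-japanese-whole-word-masking",
    top_k=5,
)

for prediction in fill_mask("リンゴ[MASK]食べる。"):
    # Each prediction contains the filled sequence, the predicted token,
    # and its probability (the "score" shown on the web page).
    print(prediction["sequence"], prediction["token_str"], prediction["score"])
```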

Thank you for your time.