Closed: lwmlyy closed this issue 3 years ago
Hi, nice work there. Could you please detail the multilingual dataset version ('all' or 'wn') and also the multilingual BERT version (base or large)?
We used the 'wn' version. The BERT model is 'bert-base-multilingual-cased'.
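For reference, a minimal sketch of loading that checkpoint with HuggingFace `transformers` (the Hub identifier is the public one named above; whether this repo wraps the encoder differently is an assumption on my part):

```python
from transformers import AutoModel, AutoTokenizer

# Public Hub identifier mentioned above; how the repo wires it into
# its own model is assumed, not confirmed here.
MODEL_NAME = "bert-base-multilingual-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

# Quick sanity check: encode a sentence and inspect the hidden size
# (768 for the multilingual BERT base model).
inputs = tokenizer("a gloss disambiguation example", return_tensors="pt")
hidden = encoder(**inputs).last_hidden_state
print(hidden.shape)  # (1, seq_len, 768)
```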
I see, thanks. I wonder if there is any particular reason the large model was not used.
There wasn't a multilingual BERT large model when the experiments were performed (I'm not sure there is one now). If you need a stronger multilingual model, you can use XLM-R; it is trivial to train one with our code. I could release a pre-trained checkpoint someday if I see interest in this.
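For anyone who wants to try the swap, here is a minimal sketch of loading XLM-R through HuggingFace `transformers`. The Hub identifiers ("xlm-roberta-base"/"xlm-roberta-large") are the public ones; how this repo actually selects its encoder (e.g. a command-line argument) is an assumption, so treat the snippet as illustrative only:

```python
from transformers import AutoModel, AutoTokenizer

def load_encoder(name: str = "xlm-roberta-large"):
    """Load a multilingual encoder by Hub name.

    The default name is an assumption for illustration; substitute
    whatever identifier the training script expects.
    """
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model

tokenizer, model = load_encoder()

# Note: XLM-R uses SentencePiece with <s>/</s> special tokens rather than
# BERT's [CLS]/[SEP], so any code that hard-codes BERT tokens needs adjusting.
enc = tokenizer("una frase de ejemplo", return_tensors="pt")
out = model(**enc).last_hidden_state
print(out.shape)  # (1, seq_len, 1024) for xlm-roberta-large
```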
OK, I might try it myself with XLM-R. Thanks.