Closed JamesArthurHolland closed 3 years ago
Hi James,
Thank you for your interest in our work. You can access the pretrained multilingual model by passing the ID bert-base-multilingual-cased to the --model_name_or_path argument. This will download a PyTorch version of the model from the HuggingFace model repository.
As a side note, you might want to fine-tune it on multilingual WSD datasets in addition to SemCor for it to work well in multilingual settings.
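For readers who want to check the download outside the repository's scripts, here is a minimal sketch of loading the same model ID directly with the Hugging Face transformers library. It assumes transformers and PyTorch are installed. The helper name load_multilingual_bert is hypothetical, and the sketch loads only the base encoder, not the WSD head that this repository trains on top of it.

```python
MODEL_ID = "bert-base-multilingual-cased"  # model ID quoted in the reply above


def load_multilingual_bert(model_id: str = MODEL_ID):
    """Download (on first call) and load the tokenizer and base encoder.

    Hypothetical helper for illustration; the repository's own scripts
    take the same ID through --model_name_or_path instead.
    """
    # Imported lazily so that importing this file does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling load_multilingual_bert() once triggers the download; later calls should reuse the cached files.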
Hi BPYap,
Thanks for your speedy response.
Where does it download the model to? The model folder?
Can I pass the id to the demo_model script? I'm currently trying to do that but it seems to stall at "loading the model", with nothing appearing in the model folder.
The model is downloaded to the system's cache directory. In my case (Windows 10) it was C:\Users\<your_username>\.cache\torch\transformers\.
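To see what has actually been cached, a small sketch like the following can list the files in that directory. It assumes the location quoted above (.cache/torch/transformers under the user's home directory); other versions of the library may use a different cache location, so treat the path as an assumption. Both function names are hypothetical.

```python
from pathlib import Path


def default_cache_dir() -> Path:
    """Cache location quoted in this thread: <home>/.cache/torch/transformers."""
    return Path.home() / ".cache" / "torch" / "transformers"


def list_cached_files(cache_dir: Path):
    """Return (file name, size in MiB) for files directly in cache_dir, largest first."""
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded to this location yet
    entries = [
        (p.name, p.stat().st_size / 2**20)
        for p in cache_dir.iterdir()
        if p.is_file()
    ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)
```

Printing the result of list_cached_files(default_cache_dir()) shows whether the weights file (by far the largest entry) has arrived.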
About the demo_model script: apparently it does work with a model ID (though that was not intended). Here's my console output:
python script\demo_model.py "bert-base-multilingual-cased"
To use data.metrics please install scikit-learn. See https://scikit-learn.org/stable/index.html
Loading model...
Enter a sentence with an ambiguous word surrounded by [TGT] tokens
> He caught a [TGT] bass [TGT] yesterday.
Progress: 100%|██████████████████████████████████████████████████████████████████████████| 9/9 [00:00<00:00, 13.14it/s]
Predictions:
No.   Sense key           Definition                                                                                          Score
----- ------------------- --------------------------------------------------------------------------------------------------- -------
1     bass%1:18:00::      an adult male singer with the lowest voice                                                          0.11432
2     bass%1:10:01::      the lowest part in polyphonic music                                                                 0.11414
3     bass%1:10:00::      the lowest adult male singing voice                                                                 0.11288
4     bass%5:00:00:low:03 having or denoting a low vocal or instrumental range                                                0.11221
5     bass%1:07:01::      the lowest part of the musical range                                                                0.11164
6     bass%1:06:02::      the member with the lowest range of a family of musical instruments                                 0.10972
7     bass%1:13:02::      the lean flesh of a saltwater fish of the family Serranidae                                         0.10837
8     bass%1:05:00::      nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes              0.10836
9     bass%1:13:01::      any of various North American freshwater fish with lean flesh (especially of the genus Micropterus) 0.10836
Enter a sentence with an ambiguous word surrounded by [TGT] tokens
>
You might need to give it a few minutes to download completely. It appears stuck because the download progress bar is not being displayed for some reason.
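Since no progress bar is shown, one way to tell a slow download from a genuine hang is to sample the size of the cache directory a few times and see whether it grows. The sketch below uses only the standard library. The function names and the sampling interval are illustrative assumptions, not part of the repository.

```python
import time
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Total size of all regular files under path (0 if it does not exist)."""
    total = 0
    if not path.exists():
        return total
    for p in path.rglob("*"):
        try:
            if p.is_file():
                total += p.stat().st_size
        except OSError:
            # Temporary download files can disappear between the two calls.
            continue
    return total


def watch_download(path: Path, checks: int = 5, interval_s: float = 2.0):
    """Sample the directory size `checks` times and return the sizes in bytes."""
    sizes = []
    for i in range(checks):
        sizes.append(dir_size_bytes(path))
        if i < checks - 1:
            time.sleep(interval_s)
    return sizes


def is_growing(sizes) -> bool:
    """True if the last sample is larger than the first."""
    return len(sizes) >= 2 and sizes[-1] > sizes[0]
```

Run watch_download on the cache directory from a second terminal while the script sits at "Loading model..."; a growing size suggests the download is still in progress.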
Thanks, that worked.
Hi,
I'm trying to use BERT-Base, Multilingual Cased from google-research, but it's taking very long to load. I suspect an infinite loop.
The files in the model folder have different names than the models available from BERT-WSD.
How can I make this other model compatible? I need multilingual support.