asofiaoliveira / srl_bert_pt

Portuguese BERT and XLM-R models fine-tuned in semantic role labeling.
Apache License 2.0

srl-en-xlmr tokenizer not uploaded to HF 🤗 #3

Closed HaritzPuerto closed 3 years ago

HaritzPuerto commented 3 years ago

Hi,

Thanks for your work! I appreciate all the documentation and that you uploaded the models to 🤗. It is really useful!

I am trying to use srl-en-xlmr, but it seems that the tokenizer has not been uploaded. Loading the tokenizer fails with TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType. You can see it in this Colab notebook:

https://colab.research.google.com/drive/1SZqTNkbTK4dUPK1PjdC438YAGkhECYeE?usp=sharing

Is the tokenizer the same one as tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")?

Thank you

asofiaoliveira commented 3 years ago

Hi,

Thanks for bringing this to my attention! The tokenizer is indeed the same as xlm-roberta-base, but it should also work with

tokenizer = AutoTokenizer.from_pretrained("liaad/srl-en-xlmr-base")

There was a mistake in the config files for my XLM-R models that made them unable to work with AutoTokenizer. I've fixed it and it should now work. Please re-open this issue if it does not and I'll look into it again.
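
For anyone who hits the same error with a cached or older copy of the model, here is a minimal sketch of a loader that tries the model's own tokenizer first and falls back to the base one, since the two are stated above to be the same. The helper name load_tokenizer_with_fallback and its loader parameter are hypothetical, not part of this repo or of transformers, and the set of exceptions caught is an assumption based on the TypeError reported in this issue.

```python
from typing import Any, Callable, Optional


def load_tokenizer_with_fallback(
    primary: str,
    fallback: str,
    loader: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Try to load the tokenizer for `primary`; on failure, load `fallback`.

    `loader` is a hypothetical hook so the logic can be exercised offline.
    By default it is transformers' AutoTokenizer.from_pretrained.
    """
    if loader is None:
        # Imported lazily so the helper can be defined without transformers.
        from transformers import AutoTokenizer

        loader = AutoTokenizer.from_pretrained

    try:
        return loader(primary)
    except (TypeError, OSError) as err:
        # TypeError is the error reported in this issue; OSError is an
        # assumption covering missing or unreachable tokenizer files.
        print(f"Could not load tokenizer for {primary!r} ({err}); using {fallback!r}.")
        return loader(fallback)
```

With transformers installed, it could be called as load_tokenizer_with_fallback("liaad/srl-en-xlmr-base", "xlm-roberta-base"). The fallback is only safe here because the tokenizer is the same as xlm-roberta-base; for other models, check before substituting a different tokenizer.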