We'll do that as soon as we can (all authors are currently busy with EMNLP & COLING conference submissions).
Thanks, @datquocnguyen! Hope everything went well with the submissions. It would also be nice to have the tokenizer built on the Hugging Face API, so the whole pipeline becomes simpler (without fairseq).
I am waiting for the transformers developers to merge my pull request. In the meantime, you can use BERTweet from this fork: https://github.com/datquocnguyen/transformers
git clone https://github.com/datquocnguyen/transformers
cd transformers
pip install .
An example is available at: https://github.com/datquocnguyen/transformers/tree/master/model_cards/vinai/bertweet-base
from transformers import BertweetModel, BertweetTokenizer

bertweet = BertweetModel.from_pretrained("vinai/bertweet-base")
tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-base")
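For completeness, here is a minimal end-to-end sketch of extracting features for a single tweet with these classes. The tweet text and variable names are only illustrative, and the snippet assumes the fork follows the standard transformers interface (tokenizer.encode plus a model forward pass) and that the input has already been normalized the way BERTweet expects (user mentions as @USER, URLs as HTTPURL):

import torch
from transformers import BertweetModel, BertweetTokenizer

# Load the pre-trained model and its tokenizer from the fork
bertweet = BertweetModel.from_pretrained("vinai/bertweet-base")
tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-base")

# Illustrative, already-normalized tweet (mentions -> @USER, URLs -> HTTPURL)
line = "Just tried the new release and it works great @USER HTTPURL"

# Encode to token IDs and run a forward pass without gradient tracking
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = bertweet(input_ids)  # contextual embeddings for the tweet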
It would be nice to have the model also hosted on the Hugging Face model hub (https://huggingface.co/models), so people could use it through the Hugging Face API without manually downloading the model dump.