Matthieu-Tinycoaching opened 3 years ago
With Hugging Face transformers v4, the fast tokenizer is used by default.
There is a fast tokenizer available for XLM-R: https://huggingface.co/transformers/model_doc/xlmroberta.html#xlmrobertatokenizerfast
You can check which tokenizer class is in use like this:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/stsb-xlm-r-multilingual")
print(type(model.tokenizer))
Running the tokenizer on GPU is not possible (and not sensible): tokenization is string processing, which would not benefit from a GPU.
Hi @nreimers
Do you mean this code instead?
from transformers import AutoTokenizer
model_name = "sentence-transformers/stsb-xlm-r-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
print(type(tokenizer))
And I got the following output:
<class 'transformers.models.xlm_roberta.tokenization_xlm_roberta_fast.XLMRobertaTokenizerFast'>
So this means I am already using XLMRobertaTokenizerFast?
Is there a speed increase if I use Sentence-Transformers instead of the Hugging Face Models repository for the stsb-xlm-r-multilingual model (https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual)?
Yes, you are already using the fast tokenizer.
Speedup: not necessarily. Sentence-Transformers relies on the tokenizer and model from HF Transformers. It applies some optimizations to reduce padding and compute overhead when you encode a larger batch of sentences, but the overall speed will be about the same.
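As a minimal sketch of that batch-encoding path (assuming sentence-transformers is installed; the example sentences are placeholders):
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/stsb-xlm-r-multilingual")
sentences = ["A short sentence.", "A much longer sentence used for comparison."]
# encode() batches the inputs internally to keep padding overhead low.
embeddings = model.encode(sentences, batch_size=32)
print(embeddings.shape)  # (num_sentences, embedding_dim)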
Hi,
I am blocked from reaching low-latency responses because of the tokenizer computation time of the stsb-xlm-r-multilingual model. Does anyone have an idea how to get a fast tokenizer for the stsb-xlm-r-multilingual model? Is there any way to run the tokenizer on GPU?
Thanks!
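For reference, here is a minimal sketch of what the replies above suggest (the model name is the one discussed in this thread; the sentences are placeholders). Tokenization stays on the CPU, but the Rust-backed fast tokenizer processes a whole batch in one call, which is usually the most effective way to reduce tokenization latency:
from transformers import AutoTokenizer

# The fast tokenizer class is loaded by default with transformers v4.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/stsb-xlm-r-multilingual")
sentences = ["First sentence.", "A second, somewhat longer sentence."]
# Tokenize the whole batch in a single call; this runs on CPU only.
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
print(batch["input_ids"].shape)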