huggingface / api-inference-community


[`sentence-transformers`] Add sentencepiece dependency for running models with slow tokenizers only #432

Open tomaarsen opened 2 months ago

tomaarsen commented 2 months ago

Hello!

Pull Request overview

Details

See e.g. https://huggingface.co/rokn/slovlo-v1

`sentencepiece` was removed as a required dependency of `sentence-transformers`, because most models no longer need it and it doesn't work well with Python 3.12 on Windows 11. Nowadays, we simply throw an error if a model requires it and it isn't installed. However, the Inference API should have it installed so that it can also run models that only ship a slow tokenizer. I've verified locally that `sentencepiece==0.2.0` works for e.g. the slovlo-v1 model linked above.
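
For reference, here is a minimal sketch of the kind of check involved: it detects whether an optional package such as `sentencepiece` is importable and raises a readable error if not. The helper names `is_installed` and `require_for_slow_tokenizer` are hypothetical and only for illustration; this is not the actual code in `sentence-transformers` or `transformers`.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found in the current environment."""
    return importlib.util.find_spec(package) is not None


def require_for_slow_tokenizer(package: str = "sentencepiece") -> None:
    """Raise a readable error when an optional tokenizer dependency is missing.

    Hypothetical helper for illustration; not part of sentence-transformers.
    """
    if not is_installed(package):
        raise ImportError(
            f"This model only ships a slow tokenizer, which needs `{package}`. "
            f"Install it with `pip install {package}`."
        )


if __name__ == "__main__":
    # Report whether the dependency is present in this environment.
    print("sentencepiece installed:", is_installed("sentencepiece"))
```

With `sentencepiece==0.2.0` pinned in the image's requirements, a check like this would pass and models with slow tokenizers only could be loaded.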

tomaarsen commented 2 months ago

The same issue exists for some CamemBERT models, e.g. https://huggingface.co/Photon-BR/sentence-camembert-large