huggingface / text-embeddings-inference

A blazing fast inference solution for text embeddings models
https://huggingface.co/docs/text-embeddings-inference/quick_tour
Apache License 2.0

Support Alibaba-NLP/gte-multilingual-base & Alibaba-NLP/gte-multilingual-reranker-base #366

Closed sigridjineth closed 1 month ago

sigridjineth commented 3 months ago

Model description

The GTE model family is generally well supported by Hugging Face TEI, but the newly introduced gte-multilingual-base currently fails to load.

2024-08-03T08:44:51.913133Z  INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-08-03T08:44:52.701241Z  INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-08-03T08:44:52.885881Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-08-03T08:44:52.885905Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-08-03T08:44:53.270526Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-08-03T08:44:53.749799Z  INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:331: Downloading `model.safetensors`
2024-08-03T08:44:57.772619Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 4.886736258s
Error: `config.json` does not contain `id2label`

The same error happens for both biencoder and reranker.

Open source status

Provide useful links for the implementation

https://huggingface.co/Alibaba-NLP/gte-multilingual-base
https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base

kozistr commented 3 months ago

TEI determines the type of backend model from the `architectures` field in `config.json`.

However, for gte-multilingual-base the `architectures` in `config.json` are `NewModel` and `NewForTokenClassification`. I guess it'd be good to change the config (e.g. removing `NewForTokenClassification` from `architectures`).

The reranker has the correct architecture name, but its config does not contain `id2label`. So adding `id2label` seems good (or TEI could fall back to loading it as a reranker when there's no `id2label` key in `config.json`).
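The dispatch behavior described above can be sketched roughly like this (a simplified Python illustration of the idea, not TEI's actual Rust code; the function name and backend labels are assumptions):

```python
def pick_backend(config: dict) -> str:
    """Hypothetical sketch of how a backend could be chosen from config.json:
    a *ForSequenceClassification architecture implies a classifier/reranker,
    which then requires an `id2label` mapping; anything else is treated as
    an embedding model."""
    archs = config.get("architectures", [])
    if any(a.endswith("ForSequenceClassification") for a in archs):
        if "id2label" not in config:
            # This mirrors the error in the log above.
            raise ValueError("`config.json` does not contain `id2label`")
        return "classifier"
    return "embedding"
```

Under this sketch, the reranker's config (a classification architecture with no `id2label`) hits the same error shown in the issue log.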

In short, opening PRs on the model repositories to adjust the configs should solve both issues.
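The two config edits suggested above could be applied like this (a minimal sketch; the exact `id2label` values are an assumption, not the published fix):

```python
def patch_config(config: dict) -> dict:
    """Sketch of the suggested edits: drop the token-classification
    architecture for the biencoder, and add a minimal label mapping
    for the reranker so the `id2label` check passes."""
    patched = dict(config)
    # Biencoder: remove NewForTokenClassification from `architectures`.
    patched["architectures"] = [
        a for a in patched.get("architectures", [])
        if a != "NewForTokenClassification"
    ]
    # Reranker: add a single-label mapping (illustrative values).
    patched.setdefault("id2label", {"0": "LABEL_0"})
    patched.setdefault("label2id", {"LABEL_0": 0})
    return patched
```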

nbroad1881 commented 3 months ago

https://huggingface.co/Alibaba-NLP/gte-multilingual-base/discussions/7#66bfb82ea03b764ca92a2221

sigridjineth commented 3 months ago

@nbroad1881 yup, I've found that renaming the weight keys to strip the "new" prefix, so they follow the standard naming pattern (e.g., "encoder.layer.0.attention.self.query.weight"), improves compatibility on its own.
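The renaming described above could be sketched as follows (an assumption based on the comment, not the actual conversion script; the exact prefix handling may differ):

```python
def rename_weight_keys(state_dict: dict) -> dict:
    """Sketch: map `new.`-prefixed checkpoint keys onto the standard
    BERT-style names so the weights line up with what TEI expects."""
    renamed = {}
    for key, tensor in state_dict.items():
        # `removeprefix` leaves keys without the prefix untouched.
        renamed[key.removeprefix("new.")] = tensor
    return renamed
```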

But removing the prefix would mean giving up the sparse weight predictions, as mentioned in that discussion, so that needs more investigation.

OlivierDehaene commented 1 month ago

Thanks for the PRs on the repositories @nbroad1881! Yes, at the end of the day, model creators are free to follow whatever naming schemes and conventions they want, and these sometimes clash with the ones we set up in our repositories, unfortunately.