Closed: joshdevins closed this issue 10 months ago.
@srikanthmanvi @pquentin @technige hey y'all, I wanted to flag that this issue in Eland is preventing users from using the E5 small model (the easiest to use) in Elasticsearch. It would be amazing if the next Eland release fixed this issue so we can provide support for the E5 small model.
@serenachou thanks for the ping. We will prioritize this.
Please note that this applies only to the multilingual E5 model; the normal e5-small-v2 works just fine.
@srikanthmanvi if this isn't on your radar for 8.12, we would love for it to be included in 8.12, or in any earlier version that you and @pquentin are cooking up. After 8.12 we would likely be looking to prepared models, so this work would be less effective as a way to encourage customers to use this model for multilingual use cases.
Discussed f2f, @davidkyle will have a look and I will support.
Any reason to support intfloat/multilingual-e5-small but not intfloat/multilingual-e5-base or intfloat/multilingual-e5-large? It would be nice if those models worked as well.
Taking a look at the code updates, it seems this fix should affect the base and large models as well. My issue may be unrelated to this, then, but I'm experiencing a problem where the first infer request after deployment works for the larger models, while the second request and onward returns a vector of all zeros. multilingual-e5-small works as expected, though. Is this likely to be an eland bug, or something better addressed in the elasticsearch repo?
@ialdencoots thanks for reporting the problem; I have reproduced it myself. You can track the issue at https://github.com/elastic/elasticsearch/issues/102541
The bug fix linked above only applies to the small model; the error you are seeing is a different issue.
intfloat/multilingual-e5-small works well with Elastic, but note that the E5 models are trained with prefix strings which should be used for information retrieval. See https://huggingface.co/intfloat/multilingual-e5-base#faq. Prefix string support has been added to Elasticsearch in https://github.com/elastic/elasticsearch/pull/102089 and will be available in the next release (8.12).
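For anyone applying the prefixes manually before 8.12: per the E5 model card FAQ, search queries should be prefixed with "query: " and documents with "passage: " before embedding. A minimal sketch (the helper function here is illustrative, not part of eland or Elasticsearch):

```python
def add_e5_prefix(text: str, kind: str) -> str:
    """Prepend the E5 task prefix described in the model card FAQ:
    'query: ' for search queries, 'passage: ' for indexed documents.
    This helper is a hypothetical convenience, not an eland API."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return f"{kind}: {text}"

# Example usage: prefix a query and a document before embedding them.
print(add_e5_prefix("how do multilingual embeddings work", "query"))
# -> query: how do multilingual embeddings work
print(add_e5_prefix("E5 is a family of text embedding models.", "passage"))
# -> passage: E5 is a family of text embedding models.
```

The PR linked above moves this prefixing into Elasticsearch itself, so callers no longer need to remember it.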
At present, the model is processed and uploaded without error, but it fails when starting:
The model is trained from Multilingual-MiniLM, which is a BERT model, but it uses the XLM-RoBERTa tokenizer. Since we wrap models based on their architecture rather than their tokenizer type, the BERT wrapper expects input that the XLM-RoBERTa tokenizer does not produce. We should consider deciding which wrapper to use (three inputs or two) based on the tokenizer instead.
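A rough sketch of that proposed change, with hypothetical names (this is not eland's actual code): select the two-input or three-input wrapper from the tokenizer class rather than the model architecture, since BERT-style tokenizers emit token_type_ids while XLM-RoBERTa-style tokenizers do not.

```python
# Hypothetical sketch: pick the traced-model wrapper from the tokenizer
# type instead of the model architecture. BERT-style tokenizers produce
# input_ids, attention_mask, and token_type_ids (three inputs), while
# RoBERTa/XLM-RoBERTa-style tokenizers omit token_type_ids (two inputs).
TWO_INPUT_TOKENIZERS = {"XLMRobertaTokenizer", "RobertaTokenizer"}

def select_wrapper(tokenizer_class_name: str) -> str:
    """Return which wrapper to use, keyed on the tokenizer class name.
    Names and return values here are illustrative only."""
    if tokenizer_class_name in TWO_INPUT_TOKENIZERS:
        return "two-input"    # tokenizer supplies no token_type_ids
    return "three-input"      # BERT-style: token_type_ids included

# multilingual-e5-small pairs a BERT architecture with an XLM-RoBERTa
# tokenizer, so tokenizer-based selection picks the two-input wrapper:
print(select_wrapper("XLMRobertaTokenizer"))  # -> two-input
print(select_wrapper("BertTokenizer"))        # -> three-input
```

Keying on the tokenizer matches what actually arrives at the model's forward pass, which is the mismatch this issue hit.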
Note that the base and large model variants work fine because they are XLM-RoBERTa models and use the corresponding tokenizer.