elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

E5 model file based access, Fails to deploy the model #112469

Open ivssh opened 2 months ago

ivssh commented 2 months ago

Elasticsearch Version

8.15

Installed Plugins

No response

Java Version

bundled

OS Version

Ubuntu 22.04 LTS

Problem Description

Documentation followed: https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-e5.html#_using_file_based_access_2

I am trying to deploy the E5 multilingual model in an air-gapped environment and get the following error:

Error: org.elasticsearch.ElasticsearchException: Failed to load vocabulary file; java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.Double (java.lang.Integer and java.lang.Double are in module java.base of loader 'bootstrap')

I suspect an issue with the vocabulary file, where some scores are being read as Integer instead of Double.
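
One way to test this suspicion is to scan the vocabulary file for scores that a JSON parser reads as integers, that is, numbers written without a decimal point. The Python sketch below is illustrative only. It assumes the vocabulary file is a JSON object with a top-level `scores` array, which is an assumption about the file layout and is not confirmed in this thread. The helper names `find_integer_scores` and `check_vocab_file` are hypothetical.

```python
import json

def find_integer_scores(vocab, key="scores"):
    """Return (index, value) pairs for entries under `key` that parsed as int.

    Assumption: the vocabulary document keeps its scores in a top-level
    list named "scores"; adjust `key` if your file differs.
    """
    scores = vocab.get(key, [])
    # bool is a subclass of int in Python, so exclude it explicitly.
    return [
        (i, s) for i, s in enumerate(scores)
        if isinstance(s, int) and not isinstance(s, bool)
    ]

def check_vocab_file(path):
    """Load a vocab JSON file and report its integer-typed scores."""
    with open(path, encoding="utf-8") as f:
        vocab = json.load(f)
    return find_integer_scores(vocab)

if __name__ == "__main__":
    # In-memory sample: "0" has no decimal point, so it parses as int,
    # while "-3.5" and "-4.0" parse as float.
    sample = json.loads('{"vocabulary": ["<s>", "a", "b"], "scores": [0, -3.5, -4.0]}')
    print(find_integer_scores(sample))  # prints [(0, 0)]
```

Running `check_vocab_file` on the real vocabulary file would list any positions where a score is stored as a whole number. An empty list would suggest the cause lies elsewhere.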

Steps to Reproduce

The steps are listed in the documentation: https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-e5.html#_using_file_based_access_2

Logs (if relevant)

No response

elasticsearchmachine commented 2 months ago

Pinging @elastic/ml-core (Team:ML)

davidkyle commented 2 months ago

I could not reproduce this on my local machine (macOS). Can you share the full stack trace please?

I copied all the files listed here to ES_HOME/config/models

$ ls -alh config/models/

-rw-rw-r--   1 davidkyle  x   835B  9 Sep 16:01 multilingual-e5-small.metadata.json
-rw-rw-r--   1 davidkyle  x   448M  9 Sep 16:01 multilingual-e5-small.pt
-rw-rw-r--   1 davidkyle  x    11M  9 Sep 16:01 multilingual-e5-small.vocab.json

And added this to my elasticsearch.yml

xpack.ml.model_repository: file://${path.home}/config/models/

After starting Elasticsearch and Kibana I was able to install the model from the UI.

Did you make this change on every master eligible node?
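
A quick per-node check along these lines may help confirm that the files are in place. This is only a sketch: the three file names come from the listing above, while the function name `missing_model_files` and the default directory are illustrative assumptions.

```python
from pathlib import Path

# File names taken from the directory listing in this comment.
EXPECTED_FILES = (
    "multilingual-e5-small.metadata.json",
    "multilingual-e5-small.pt",
    "multilingual-e5-small.vocab.json",
)

def missing_model_files(models_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent or empty in models_dir."""
    base = Path(models_dir)
    missing = []
    for name in expected:
        p = base / name
        if not p.is_file() or p.stat().st_size == 0:
            missing.append(name)
    return missing

if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python check_models.py <models directory>
    models_dir = sys.argv[1] if len(sys.argv) > 1 else "config/models"
    print(missing_model_files(models_dir))
```

An empty list means all three files are present and non-empty on that node. It does not validate their contents.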

ivssh commented 2 months ago

I made the change on every master-eligible node. The file I am referring to is this, which is where I suspect the issue is. I did this on a Linux machine with Elasticsearch 8.15 installed with the bundled Java.

davidkyle commented 2 months ago

Thanks @ivssh. I still cannot reproduce this. Could you please provide the full stack trace and a copy of your multilingual-e5-small.vocab.json file? Thanks.