elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

[NLP] Add model compatibility check mode to eland_import_hub_model.py #599

Open davidkyle opened 12 months ago

davidkyle commented 12 months ago

Currently, the best way to check whether a model is compatible with Elasticsearch is to run the eland_import_hub_model.py script. However, this requires the user to configure the Elasticsearch connection settings, and possibly even to spin up an Elasticsearch cluster, just to get past the connection test.

Add a command-line option to the script that downloads the model and checks that it is compatible with Elasticsearch, without uploading the model to Elasticsearch. This makes it convenient to check a model's compatibility quickly, with minimal configuration.

Invoking the script might look like:

    eland_import_hub_model \
      --is-compatible \
      --hub-model-id 'sentence-transformers/msmarco-MiniLM-L-12-v3' \
      --task-type text_embedding
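As a rough illustration of the control flow, the sketch below shows one way the proposed flag could bypass the connection step. It is not eland's actual code: the parser is a simplified stand-in for the real one, and `check_model` and `upload_model` are hypothetical callables standing in for the real compatibility check and upload logic. The only assumption it encodes is that `--url` becomes optional when `--is-compatible` is given.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Simplified stand-in for the script's parser, not the real one.
    parser = argparse.ArgumentParser(prog="eland_import_hub_model")
    parser.add_argument("--hub-model-id", required=True)
    parser.add_argument("--task-type", default=None)
    # Proposed in this issue: check the model only, never connect or upload.
    parser.add_argument("--is-compatible", action="store_true")
    # Deliberately optional here so that check-only runs need no cluster.
    parser.add_argument("--url", default=None)
    return parser


def main(argv, check_model, upload_model) -> int:
    """Return a process exit code.

    check_model(model_id, task_type) -> bool and
    upload_model(url, model_id, task_type) are hypothetical hooks.
    """
    args = build_parser().parse_args(argv)

    if args.is_compatible:
        # Check-only path: no connection settings are read or validated.
        ok = check_model(args.hub_model_id, args.task_type)
        print("compatible" if ok else "not compatible")
        return 0 if ok else 1

    if args.url is None:
        print("error: --url is required unless --is-compatible is given")
        return 2

    upload_model(args.url, args.hub_model_id, args.task_type)
    return 0
```

With this shape, a non-zero exit code signals an incompatible model, which would make the check easy to use from shell scripts or CI jobs.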