elastic / eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
https://eland.readthedocs.io
Apache License 2.0

[Import Model] High transient memory usage when installing large models with Docker #566

Open davidkyle opened 1 year ago

davidkyle commented 1 year ago

Using the Docker container to run eland_import_hub_model and install a large model requires a large amount of memory. For example, installing xlm-roberta-base with the command below requires the container to have more than 8GB of memory.

docker run -it --rm --network host \
    elastic/eland \
    eland_import_hub_model \
      --url 'https://elastic:XXX@host:9200/' \
      --hub-model-id xlm-roberta-base \
      --task-type fill_mask

If the container does not have enough memory, the process exits shortly after the download has completed, before the model is uploaded to Elasticsearch. Investigate what is causing the high memory usage.
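One way to start the investigation is to log the process's peak resident memory around each stage of the import. The sketch below is only a generic helper built on the Python standard library (Unix only); `peak_rss_mib` and `report_peak` are hypothetical names, not part of eland, and where to place them inside the import script is an assumption left to whoever picks this up.

```python
import resource
import sys
from contextlib import contextmanager


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


@contextmanager
def report_peak(label, log=print):
    """Log how much the process's peak RSS grew while the block ran."""
    before = peak_rss_mib()
    try:
        yield
    finally:
        after = peak_rss_mib()
        log(f"{label}: peak RSS {after:.0f} MiB (+{after - before:.0f} MiB)")


if __name__ == "__main__":
    # Stand-in workload; in practice each stage of the import
    # (download, conversion, upload) would be wrapped separately.
    with report_peak("allocate 64 MiB"):
        buf = b"x" * (64 * 1024 * 1024)
```

Because the peak only ever increases, a stage that reports a large positive delta is one that pushed memory to a new high, which narrows down where the transient spike occurs.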