This PR reduces the chunk size of the model stored in Elasticsearch from 4 MB to 1 MB. We have seen less memory pressure when using 1 MB chunks.
Part of issue: https://github.com/elastic/elasticsearch/issues/99409
Related PR: https://github.com/elastic/elasticsearch/pull/99677
Testing
I tested this by importing a ~300 MB model and extracting the binary_definition field from one of the chunk documents, which yields a file containing the base64 contents. The file is around 1 MB.
Result:
2023-09-20 14:56:34,232 INFO : Creating model with id 'sentence-transformers__all-distilroberta-v1'
2023-09-20 14:56:34,854 INFO : Uploading model definition
100%|███████████████████████████████████████████████████████████████████████████| 312/312 [00:12<00:00, 24.15 parts/s]
2023-09-20 14:56:47,776 INFO : Uploading model vocabulary
2023-09-20 14:56:47,957 INFO : Model successfully imported with id 'sentence-transformers__all-distilroberta-v1'
When searching for the chunks there were ~300 documents, which means we are correctly storing the model in 1 MB chunks.
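For illustration, the chunking behavior being verified here can be sketched in Python. This is a minimal sketch, not the actual import code: the helper name `chunk_model` and the standalone round trip are assumptions for the example; the real chunking happens inside the model-import tooling.

```python
import base64
import math

CHUNK_SIZE = 1 * 1024 * 1024  # 1 MB, the new chunk size introduced by this PR


def chunk_model(model_bytes: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split a model blob into base64-encoded chunks of at most chunk_size bytes.

    Illustrative only; the hypothetical helper mirrors how each chunk document's
    binary_definition field holds the base64 contents of one 1 MB slice.
    """
    return [
        base64.b64encode(model_bytes[i:i + chunk_size]).decode("ascii")
        for i in range(0, len(model_bytes), chunk_size)
    ]


# Simulate a "model" of ~5 MB instead of ~300 MB to keep the example fast.
model = bytes(5 * 1024 * 1024 + 123)
chunks = chunk_model(model)

# The number of chunk documents equals ceil(model size / 1 MB) -- the same
# arithmetic behind the ~300 documents observed for a ~300 MB model.
assert len(chunks) == math.ceil(len(model) / CHUNK_SIZE)

# Decoding and concatenating the chunks reconstructs the original blob.
assert b"".join(base64.b64decode(c) for c in chunks) == model
```

Under this arithmetic, a ~300 MB model produces roughly 300 one-megabyte chunk documents, which matches what the search over the chunk documents showed.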