Open · JohnFirth opened this issue 2 years ago
Do you mean a policy such as LRU?
One issue is how to check how much memory the model uses. Once it exceeds the memory threshold, we can evict the model from the cache.
Hey @WeichenXu123
> Do you mean a policy such as LRU?

Yeah, I think LRU would be suitable, at least for my use case of multiple models, each being used one after the other.
> One issue is how to check how much memory the model uses. Once it exceeds the memory threshold, we can evict the model from the cache.

I think a simple upper limit on the number of models would be adequate, at least for me. (For my use case, in fact, the limit could be 1.)
@BenWilson2 @dbczumar @harupy @WeichenXu123 Please assign a maintainer and start triaging this issue.
Hi @JohnFirth, apologies for the delay here. I think a configurable LRU cache would be great here, and we would be very excited about reviewing a PR with this feature, if you're still interested in contributing one. Please let me know if you have any questions.
No worries @dbczumar :)
Happy to help, but I'm not quite sure how to set the cache size limit, tbh.
Perhaps `SparkModelCache.get_or_load` could receive a `max_cache_size` argument from `spark_udf`, which `get_or_load` then uses to enforce the limit (?)
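For what it's worth, a rough standalone sketch of what an entry-count LRU bound on `get_or_load` could look like is below. `BoundedSparkModelCache` and `load_fn` are placeholders for illustration only, not MLflow's actual API (the real `SparkModelCache.get_or_load` loads models from the SparkFiles archive itself rather than taking a loader callback):

```python
from collections import OrderedDict


class BoundedSparkModelCache:
    """Illustrative stand-in for SparkModelCache with an LRU bound on entry count."""

    _models = OrderedDict()  # archive_path -> loaded model, least recently used first

    @classmethod
    def get_or_load(cls, archive_path, load_fn, max_cache_size=1):
        # Cache hit: mark the entry as most recently used and return it.
        if archive_path in cls._models:
            cls._models.move_to_end(archive_path)
            return cls._models[archive_path]

        # Cache miss: load the model, then evict least recently used
        # entries until we are back under the limit.
        model = load_fn(archive_path)
        cls._models[archive_path] = model
        while len(cls._models) > max_cache_size:
            cls._models.popitem(last=False)
        return model
```

With `max_cache_size=1` this degenerates to "keep only the model currently in use", which matches the single-model-at-a-time case described above.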
What about reading `max_cache_size` from an environment variable?
You can define it in the module `mlflow/environment_variables.py`.
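Something along these lines, perhaps. The variable name `MLFLOW_SPARK_MODEL_CACHE_MAX_SIZE` is only a hypothetical example, and this is a plain-`os` sketch rather than the helper classes `mlflow/environment_variables.py` actually uses:

```python
import os

# Hypothetical variable name; the real name would be settled during review.
_CACHE_SIZE_VAR = "MLFLOW_SPARK_MODEL_CACHE_MAX_SIZE"


def get_max_cache_size(default=1):
    """Read the cache limit from the environment, falling back to a default."""
    raw = os.getenv(_CACHE_SIZE_VAR)
    if raw is None:
        return default
    try:
        return max(1, int(raw))
    except ValueError:
        raise ValueError(f"{_CACHE_SIZE_VAR} must be a positive integer, got {raw!r}")
```

`spark_udf` could then call `get_max_cache_size()` once and pass the result down to `get_or_load`.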
@WeichenXu123 yeah, ok — I'll see what I can do :)
@WeichenXu123 Please reply to comments.
Willingness to contribute
Yes. I would be willing to contribute this feature with guidance from the MLflow community.
Proposal Summary
I'd like the ability to set a cache replacement policy for `SparkModelCache`, which currently has no policy: https://github.com/mlflow/mlflow/blob/9b83b355fc9c64ad1b51c66b1187eaab40d40d61/mlflow/pyfunc/spark_model_cache.py#L15
Motivation
Performing batch inference with multiple models whose combined size would exhaust memory if they were all loaded at the same time.
Others may wish to perform such an operation; I'm not sure how common the need is.
I'm currently performing batch inference with hundreds of models per Spark cluster, each of which can be up to 1 GB.
The Spark model cache has no replacement policy, so attempting the above use case could cause an OOM: https://github.com/mlflow/mlflow/blob/9b83b355fc9c64ad1b51c66b1187eaab40d40d61/mlflow/pyfunc/spark_model_cache.py#L15
Details
Perhaps this could be configured with an environment variable, but I'm not too sure. Happy to try to supply this feature with some guidance :)
What component(s) does this bug affect?
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support

What language(s) does this bug affect?
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages

What integration(s) does this bug affect?
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations