Open oreh opened 3 years ago
I created a pull request to demonstrate how to use a flag to bypass docker build. https://github.com/mlflow/mlflow/pull/3742
@oreh I tested your PR but it does not work unless the exact image is specified in the kube config.
In PR #3987 I propose solving the exact same problem in a slightly different way: you do not have to specify the image to be used in the kube context. The image built on your dev machine is automatically used for all other jobs started from within the first pod.
What do you think about this?
It seems this feature is not in a release yet. I tested both PRs, https://github.com/mlflow/mlflow/pull/3742 and https://github.com/mlflow/mlflow/pull/3987, and both work well. I would love to see this feature merged into a release by the mlflow developers. Thanks!
@oreh @fg91 @ShuxinLin has tested it extensively and it works.
Thank you for submitting a feature request. Before proceeding, please review MLflow's Issue Policy for feature requests and the MLflow Contributing Guide.
Please fill in this feature request template to ensure a timely and thorough response.
Willingness to contribute
The MLflow Community encourages new feature contributions. Would you or another member of your organization be willing to contribute an implementation of this feature (either as an MLflow Plugin or an enhancement to the MLflow code base)?
Proposal Summary
The current 'kubernetes' backend requires users to rebuild and push a docker image for every run. This makes it difficult to launch mlflow from a pod inside a Kubernetes cluster, as one normally cannot build a docker image inside a pod. We should give users an option to launch mlflow runs with existing docker images, which could be built by other tools or processes.
Motivation
To my understanding, the reason why we need to build a new docker image for each mlflow run is to package code and data in the working directory into the image. But
Moreover, rebuilding and pushing images for each run is also a blocker to deploying the entire mlflow stack into a Kubernetes cluster. We prefer to develop mlflow projects in K8S Pods and launch multiple runs directly using the kubernetes backend. However, we do not allow users to build and push docker images inside a Pod for security reasons.
So it would be nice if we allowed mlflow users to use a flag to skip docker image building/pushing when starting runs with the 'kubernetes' backend.
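To make the request concrete, here is a minimal sketch of the decision such a flag would control. It is not MLflow's actual code or API: `plan_image`, `ImagePlan`, `skip_image_build`, and the `build_and_push` callback are hypothetical names made up for illustration, and the registry URIs are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ImagePlan:
    image_uri: str  # image the Kubernetes job should run
    built: bool     # whether a build/push was triggered for this run


def plan_image(
    base_image: str,
    repository_uri: str,
    run_id: str,
    skip_image_build: bool,
    build_and_push: Optional[Callable[[str, str], None]] = None,
) -> ImagePlan:
    """Pick the image for a run (hypothetical helper, not part of MLflow)."""
    if skip_image_build:
        # Reuse the existing image as-is: no docker daemon is needed in the pod.
        return ImagePlan(image_uri=base_image, built=False)

    if build_and_push is None:
        raise ValueError("build_and_push is required unless skip_image_build is set")

    # Default behaviour described above: build a per-run image and push it.
    tagged = f"{repository_uri}:{run_id}"
    build_and_push(base_image, tagged)
    return ImagePlan(image_uri=tagged, built=True)


if __name__ == "__main__":
    plan = plan_image(
        base_image="registry.example.com/train:latest",
        repository_uri="registry.example.com/runs",
        run_id="abc123",
        skip_image_build=True,
    )
    print(plan)
```

With the flag set, nothing from the working directory is packaged into the image, so the existing image (or something mounted into the job) would have to provide the project code and data.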
What component(s), interfaces, languages, and integrations does this feature affect?
Components
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/projects: MLproject format, project running backends
area/scoring: Local serving, model deployment tools, spark UDFs
area/server-infra: MLflow server, JavaScript dev server
area/tracking: Tracking Service, tracking client APIs, autologging
Interfaces
area/uiux: Front-end, user experience, JavaScript, plotting
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
Languages
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
Integrations
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations
Details
(Use this section to include any additional information about the feature. If you have a proposal for how to implement this feature, please include it here. For implementation guidelines, please refer to the Contributing Guide.)