mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[FR] Allow openai flavor to target deployment server instead of raw OAI endpoint #11762

Open akshaya-a opened 2 months ago

akshaya-a commented 2 months ago

Willingness to contribute

Yes. I can contribute this feature independently.

Proposal Summary

This might be better designed as a load-time argument rather than my hardcoded patch:

mlflow.pyfunc.load_model(model_uri, model_config={"endpoint/deployment_id": "foo"})

and the openai flavor would auto-inject that into the model config.
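A minimal sketch of what that injection could look like. The key name `"endpoint/deployment_id"` comes from the snippet above; the resolver function and field names here are illustrative assumptions, not MLflow API:

```python
# Hypothetical sketch: merging a load-time model_config over the saved
# flavor config, so a deployment target can override the raw OpenAI
# endpoint. resolve_target() is an assumed helper, not part of MLflow.

def resolve_target(saved_config, model_config=None):
    """Return the flavor config with any load-time overrides applied."""
    merged = dict(saved_config)
    if model_config:
        merged.update(model_config)
    return merged

saved = {"model": "gpt-4o", "endpoint": "https://api.openai.com/v1"}
override = {"endpoint/deployment_id": "foo"}
print(resolve_target(saved, override)["endpoint/deployment_id"])  # foo
```

The point is that the saved model stays untouched; only the consuming process decides, at load time, where requests should go.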

An even cooler feature might be a global switch, essentially

something like mlflow.prefer_deployment_server() (or an equivalent environment variable), that applies automatically so consumption code doesn't have to be mutated and it becomes a config-level operation.
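A global switch like that could plausibly be driven by an environment variable so no call-site changes are needed. The variable name and helper below are assumptions for illustration; MLflow does not define them:

```python
import os

# Hypothetical sketch of a process-wide "prefer deployment server"
# toggle. MLFLOW_PREFER_DEPLOYMENT_SERVER is an assumed env var name.

def prefer_deployment_server() -> bool:
    """True if the process is configured to route through the deployment server."""
    value = os.environ.get("MLFLOW_PREFER_DEPLOYMENT_SERVER", "false")
    return value.strip().lower() in ("1", "true", "yes")

os.environ["MLFLOW_PREFER_DEPLOYMENT_SERVER"] = "true"
print(prefer_deployment_server())  # True
```

Because it is read from the environment, the same consumption code behaves differently per deployment without edits, which is the config-level behavior described above.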

Right now I'm using a simple heuristic to match the model with a deployment server endpoint (by provider, task, and model-id prefix), which works fine.
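The heuristic described above might be sketched like this. The endpoint/model field names are illustrative assumptions about what the matcher sees, not actual MLflow structures:

```python
# Hypothetical sketch of the matching heuristic: choose the deployment
# server endpoint whose provider and task match the model and whose
# model-id prefix matches the logged model id.

def match_endpoint(model, endpoints):
    """Return the name of the first endpoint matching the model, else None."""
    for ep in endpoints:
        if (ep["provider"] == model["provider"]
                and ep["task"] == model["task"]
                and model["model_id"].startswith(ep["model_prefix"])):
            return ep["name"]
    return None

endpoints = [
    {"name": "chat-ep", "provider": "openai",
     "task": "llm/v1/chat", "model_prefix": "gpt-4"},
]
model = {"provider": "openai", "task": "llm/v1/chat", "model_id": "gpt-4o-mini"}
print(match_endpoint(model, endpoints))  # chat-ep
```

First-match-wins keeps the heuristic predictable, at the cost of ignoring a better match later in the list; an explicit `model_config` override would bypass the heuristic entirely.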

Motivation

What is the use case for this feature?

Why is this use case valuable to support for MLflow users in general?

Why is this use case valuable to support for your project(s) or organization?

Why is it currently difficult to achieve this use case?

Details

No response

What component(s) does this bug affect?

What interface(s) does this bug affect?

What language(s) does this bug affect?

What integration(s) does this bug affect?

daniellok-db commented 2 months ago

I think this makes sense! Being able to specify this at load time sounds like a good idea.

github-actions[bot] commented 2 months ago

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.