We are excited to announce the release of MLflow 2.18.0! This release includes a number of significant features, enhancements, and bug fixes.
Python Version Update
Python 3.8 has reached end-of-life. With official support for this legacy version dropped, MLflow now requires Python 3.9
as the minimum supported version.
Note: If you are currently using MLflow's ChatModel interface for authoring custom GenAI applications, please ensure that you
have read the future breaking changes section below.
Major New Features
🦺 Fluent API Thread/Process Safety - MLflow's fluent APIs for tracking and the model registry have been overhauled to add support for both thread and multi-process safety. You no longer need to use the Client APIs to manage experiments, runs, and logging from within multiprocessing and threaded applications. (#13456, #13419, @WeichenXu123)
🖥️ Enhanced Trace UI - MLflow Tracing's UI has undergone a significant overhaul to bring usability and quality-of-life updates to the experience of auditing and investigating the contents of GenAI traces, from enhanced span content rendering using markdown to a standardized span component structure. (#13685, #13357, #13242, @daniellok-db)
🚄 New Tracing Integrations - MLflow Tracing now supports DSPy, LiteLLM, and Google Gemini, enabling a one-line, fully automated tracing experience. These integrations unlock enhanced observability across a broader range of industry tools. Stay tuned for upcoming integrations and updates! (#13801, @TomeHirata, #13585, @B-Step62)
📊 Expanded LLM-as-a-Judge Support - MLflow now enhances its evaluation capabilities with support for additional providers, including Anthropic, Bedrock, Mistral, and TogetherAI, alongside existing providers like OpenAI. Users can now also configure proxy endpoints or self-hosted LLMs that follow the provider API specs by using the new proxy_url and extra_headers options. Visit the LLM-as-a-Judge documentation for more details! (#13715, #13717, @B-Step62)
⏰ Environment Variable Detection - MLflow now detects environment variables set during model logging and reminds you to configure them for deployment. The mlflow.models.predict utility has also been updated to include these variables in serving simulations, improving pre-deployment validation. (#13584, @serena-ruan)
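To illustrate the fluent API thread-safety item above, here is a minimal sketch (not taken from the MLflow documentation) of logging from several threads through the fluent API rather than a client object. The helper names train_one and run_sweep and the logged values are hypothetical. The sketch assumes MLflow 2.18.0 or later is installed and that a tracking location is already configured.

```python
from concurrent.futures import ThreadPoolExecutor


def train_one(lr):
    """Log one hypothetical training run from a worker thread."""
    # Imported here so the sketch can be loaded without MLflow installed.
    import mlflow

    # Assumption: with the thread-safe fluent API, the run started here is
    # the active run for this thread only.
    with mlflow.start_run(run_name=f"lr-{lr}") as run:
        mlflow.log_param("lr", lr)
        mlflow.log_metric("loss", 1.0 / (1.0 + lr))  # placeholder metric
        return run.info.run_id


def run_sweep(learning_rates):
    """Start one run per learning rate, each in its own thread."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(train_one, learning_rates))
```

Calling run_sweep([0.1, 0.01]) would return one run ID per learning rate. In earlier releases the same pattern generally meant passing explicit run IDs to a client in each thread.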
Breaking Changes to ChatModel Interface
ChatModel Interface Updates - As part of a broader unification effort within MLflow and services that rely on or deeply integrate
with MLflow's GenAI features, we are taking a phased approach to building a consistent, standard interface for custom GenAI
application development and usage. In the first phase (planned for the next few releases of MLflow), we are marking
several interfaces as deprecated, as they will be changing. These changes will be:
Renaming of Interfaces:
ChatRequest → ChatCompletionRequest to provide disambiguation for future planned request interfaces.
ChatResponse → ChatCompletionResponse for the same reason as the input interface.
metadata fields within ChatRequest and ChatResponse → custom_inputs and custom_outputs, respectively.
Streaming Updates:
predict_stream will be updated to enable true streaming for custom GenAI applications. Currently, it returns a generator with synchronous outputs from predict. In a future release, it will return a generator of ChatCompletionChunks, enabling asynchronous streaming. While the API call structure will remain the same, the returned data payload will change significantly, aligning with LangChain’s implementation.
Legacy Dataclass Deprecation:
Dataclasses in mlflow.models.rag_signatures will be deprecated, merging into unified ChatCompletionRequest, ChatCompletionResponse, and ChatCompletionChunks.
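To prepare for the planned metadata rename, here is a minimal sketch of migrating dict-shaped payloads to the new field names. It is not part of MLflow: rename_fields, REQUEST_RENAMES, and RESPONSE_RENAMES are hypothetical names, and the sketch assumes requests and responses are handled as plain dictionaries whose only change is the metadata rename described above.

```python
from typing import Any, Dict

# Assumed renames, taken from the release notes above.
REQUEST_RENAMES = {"metadata": "custom_inputs"}
RESPONSE_RENAMES = {"metadata": "custom_outputs"}


def rename_fields(payload: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a shallow copy of payload with old field names replaced."""
    migrated = dict(payload)  # leave the caller's dict untouched
    for old, new in renames.items():
        # Only move the value if the new field is not already populated.
        if old in migrated and new not in migrated:
            migrated[new] = migrated.pop(old)
    return migrated
```

For example, rename_fields({"messages": [], "metadata": {"session": "abc"}}, REQUEST_RENAMES) moves the session dictionary under custom_inputs and leaves messages unchanged.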
Other Features:
[Evaluate] Add Huggingface BLEU metrics to MLflow Evaluate (#12799, @nebrass)
[Models / Databricks] Add support for spark_udf when running on Databricks Serverless runtime, Databricks Connect, and prebuilt Python environments (#13276, #13496, @WeichenXu123)
[Scoring] Add a model_config parameter for pyfunc.spark_udf for customization of batch inference payload submission (#13517, @WeichenXu123)
[Tracing] Standardize retriever span outputs to a list of MLflow Documents (#13242, @daniellok-db)
[UI] Add support for visualizing and comparing nested parameters within the MLflow UI (#13012, @jescalada)
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions
Bumps the pip-others group with 2 updates: transformers and mlflow.
Updates transformers from 4.46.2 to 4.46.3
Release notes
Sourced from transformers's releases.
Commits
052e652 v4.46.3
e01a61a FSDP grad accum fix (#34645)
Updates mlflow from 2.17.2 to 2.18.0
Release notes
Sourced from mlflow's releases.
... (truncated)
Changelog
Sourced from mlflow's changelog.
... (truncated)
Commits
65d4042 Run python3 dev/update_mlflow_versions.py pre-release ... (#13816)
7ecea15 Remove height limitation (#13785)
e9273fe LiteLLM tracing (#13585)
dd21941 Dspy tracing doc (#13807)
fd1da0d Introduce Gemini tracing (#13801)
50dc2ce Improve authentication error message (#13808)
22be02e Handle raw response in openai autolog (#13802)
b4361b1 Suppress tracing for internal prediction during DSPy logging/loading (#13798)
2bfdd41 Support setting retriever schema for DSPy model within constructor (#13800)
9016fdb Update Transformer's chatbot tutorial not to conflict with ChatModel tutorial...