Open harshilprajapati96 opened 3 months ago
One question: what if you add the computed metric values via span.set_outputs/set_attributes? Is there an essential difference that prevents a metric from being added as an output/attribute?

Q2: Tracing is for LLM inference, which is a performance-sensitive workload. Generally, in LLM inference tracing we should avoid introducing too much overhead. Evaluating metrics is a heavy workload, so tracing does not seem like a good place to do it (maybe adding similar functionality in mlflow.models.evaluate is better — see the sketch below).
cc @B-Step62
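For reference, a minimal sketch of what evaluating outside of tracing could look like with mlflow.evaluate on pre-computed chain outputs. The column names and the task type here are illustrative, and some built-in metrics pull in extra dependencies:

```python
import mlflow
import pandas as pd

# Illustrative data; in practice these would be the chain's inputs/outputs.
eval_df = pd.DataFrame(
    {
        "inputs": ["What is MLflow?"],
        "outputs": ["MLflow is an open source MLOps platform."],
        "ground_truth": ["MLflow is an open source platform for the ML lifecycle."],
    }
)

with mlflow.start_run():
    # Evaluate pre-computed predictions against ground truth; metrics are
    # logged to the run rather than attached to a trace.
    results = mlflow.evaluate(
        data=eval_df,
        predictions="outputs",
        targets="ground_truth",
        model_type="question-answering",
    )
    print(results.metrics)
```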
Right now I am just starting an overarching span and then adding the metrics as its attributes:
```python
import mlflow
from mlflow.langchain.langchain_tracer import MlflowLangchainTracer

mlflow_tracer = MlflowLangchainTracer()

with mlflow.start_span(name="evaluate") as span:
    span.set_inputs(input)
    output = chain.invoke(input, config={"callbacks": [mlflow_tracer]})
    span.set_outputs(output)
    # Metrics are attached as plain span attributes for now.
    metrics = evaluate_chain(expected, output)
    span.set_attributes(metrics)
```
This works for now, but having queryable metrics would be nice to have.
How do I set tags on the span? I tried adding tags in Context and passing it into MlflowLangchainTracer, but that doesn't work.
I feel having metrics on traces would make debugging easier, even if we could track the trace ID and then calculate and add the metrics later. We use Langfuse scores for that right now: https://langfuse.com/docs/scores/overview
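For context, the Langfuse flow we use today looks roughly like this — a sketch based on the v2-style Langfuse Python SDK; the score name and value are illustrative:

```python
from langfuse import Langfuse

langfuse = Langfuse()

# After the traced chain run, attach computed metrics to the trace as
# scores; Langfuse surfaces these as filterable columns in its UI.
langfuse.score(
    trace_id=trace_id,  # captured from the chain invocation's trace
    name="answer_relevance",
    value=0.87,
)
```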
For setting tags, you can use:

```python
from mlflow import MlflowClient

MlflowClient().set_trace_tag(span.request_id, "key", "value")
```
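Combined with the workaround above, tags then make the traces filterable later. A minimal sketch, assuming a recent MLflow version where mlflow.search_traces supports tag filters (the "dataset" tag is illustrative):

```python
import mlflow
from mlflow import MlflowClient

client = MlflowClient()

with mlflow.start_span(name="evaluate") as span:
    # ... run the chain and set inputs/outputs/attributes as above ...
    # Tags, unlike span attributes, can be used in search filter strings.
    client.set_trace_tag(span.request_id, "dataset", "qa-v1")

# Later, pull back only the tagged traces:
traces_df = mlflow.search_traces(filter_string="tags.dataset = 'qa-v1'")
```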
Any thoughts on this FR?
@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.
Willingness to contribute
Yes. I would be willing to contribute this feature with guidance from the MLflow community.
Proposal Summary
For LLM traces, particularly with the MlflowLangchainTracer, it would be beneficial to include columns for metrics that can be added with each invocation.
Motivation
Details
One way could be to (sketched below):

- Add `span.set_metrics` and log metrics with it
- Update the UI to show all metrics as columns, as we see for runs, and make them queryable via the client
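To make that concrete, here is a hypothetical sketch — `span.set_metrics` and the `metrics.` filter prefix do not exist in MLflow today; they are the proposal:

```python
import mlflow

with mlflow.start_span(name="evaluate") as span:
    output = chain.invoke(input, config={"callbacks": [mlflow_tracer]})
    metrics = evaluate_chain(expected, output)

    # Proposed: store typed, first-class metrics instead of opaque
    # attributes, so the UI can render them as sortable columns like
    # run metrics.
    span.set_metrics(metrics)

# Proposed: query traces by metric via the fluent/client API, e.g.:
# mlflow.search_traces(filter_string="metrics.exact_match = 1.0")
```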