christophergrant / databricks-opentelemetry


Tracing examples and configs when model serving / calling LLM #1

Open christophergrant opened 7 months ago

christophergrant commented 7 months ago

When serving models / LLMs, latency is often critical but can be difficult to attribute, especially in complicated architectures that span multiple steps and systems.

Logs can be super helpful for troubleshooting, but collating logs across multiple systems is tedious, and raw logs aren't easy for humans to consume. Tracing is often seen as a better alternative: tools like Jaeger render Gantt-chart-style timelines out of the box, even when requests cross multiple systems.

This issue tracks adding examples showing how to implement tracing for Databricks ML serving applications.