DAGWorks-Inc / burr

Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
https://burr.dagworks.io
BSD 3-Clause Clear License

Capture TTFT with streaming #329

Closed: skrawcz closed this issue 2 weeks ago

skrawcz commented 3 weeks ago

Is your feature request related to a problem? Please describe.
TTFT (time to first token) is not captured for streaming or async streaming actions.

Describe the solution you'd like
We should capture TTFT when an action is a streaming action and add it to the latency view.

Describe alternatives you've considered
N/A

Additional context
This is both a framework change and a UI change.

elijahbenizzy commented 3 weeks ago

Options:

  1. Encode it in a step's log -- capture:
    • time of start
    • time of first "token" (generator first yield)
    • time of last "token"
    • Number of tokens
  2. Encode it as an attribute
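
For reference, a minimal sketch of the four measurements in option 1, taken by wrapping a plain generator. This is not Burr code: `StreamTiming` and `timed_stream` are hypothetical names, and the injectable `clock` parameter is an assumption made only so the timing can be tested deterministically.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


@dataclass
class StreamTiming:
    """Hypothetical record of the values listed in option 1."""

    start: float                         # time of start
    first_token: Optional[float] = None  # time of first yielded item
    last_token: Optional[float] = None   # time of last yielded item
    count: int = 0                       # number of items yielded

    @property
    def ttft(self) -> Optional[float]:
        """Time to first token, or None if nothing was yielded."""
        if self.first_token is None:
            return None
        return self.first_token - self.start


def timed_stream(
    stream: Iterable[T],
    timing: StreamTiming,
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[T]:
    """Yield items from `stream` unchanged while updating `timing` in place."""
    for item in stream:
        now = clock()
        if timing.first_token is None:
            timing.first_token = now
        timing.last_token = now
        timing.count += 1
        yield item
```

The caller creates `StreamTiming(start=clock())` when the stream is initialized and reads `timing.ttft` once the stream is exhausted. Whether these values end up in the step's log or in an attribute is a separate choice from how they are measured.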

Either way, we'll need the following hooks:

  1. post_stream_start after we initialize the stream
  2. post_stream_step(index) after every yield -- this would count the items, and the call after the first yield would record the time of the first token
  3. post_stream_end(index) after the end of the stream
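
A rough sketch of how these three hooks could fit together. The hook names come from the list above, but everything else is an assumption: the signatures, the `StreamTimingHook` class, the `run_stream` driver, and the convention that `post_stream_end` receives the index of the last item (-1 for an empty stream). None of this is Burr's actual API.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Sequence, TypeVar

T = TypeVar("T")


class StreamTimingHook:
    """Hypothetical hook that derives TTFT from the three proposed callbacks."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock
        self.start: Optional[float] = None
        self.first: Optional[float] = None
        self.end: Optional[float] = None
        self.count = 0

    def post_stream_start(self) -> None:
        self.start = self._clock()

    def post_stream_step(self, index: int) -> None:
        if index == 0:  # first item produced: this is the TTFT mark
            self.first = self._clock()
        self.count = index + 1

    def post_stream_end(self, index: int) -> None:
        self.end = self._clock()

    @property
    def ttft(self) -> Optional[float]:
        if self.start is None or self.first is None:
            return None
        return self.first - self.start


def run_stream(stream: Iterable[T], hooks: Sequence[StreamTimingHook]) -> Iterator[T]:
    """Drive `stream`, firing the hooks around each item (assumed call order)."""
    # Being a generator, this body (including post_stream_start) only runs
    # once the consumer asks for the first item.
    for hook in hooks:
        hook.post_stream_start()
    index = -1
    try:
        for index, item in enumerate(stream):
            # Fired as soon as the item is produced, before it is handed to
            # the consumer, so consumer-side delay does not inflate the timing.
            for hook in hooks:
                hook.post_stream_step(index)
            yield item
    finally:
        # Also runs if the consumer stops iterating early.
        for hook in hooks:
            hook.post_stream_end(index)
```

With this shape, the recorded values (`start`, `first`, `end`, `count`) are what a step-end log or a separate profile log would need to carry for the UI to render TTFT.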

Then I think we should just record this as part of the step_end_log or something, or have an optional step_profile log that we can render.

elijahbenizzy commented 2 weeks ago

See #331 -- has this + a lot more