getsentry / sentry-python

The official Python SDK for Sentry.io
https://sentry.io/for/python/
MIT License

AI tracking does not work properly in Python's asynchronous generator scenarios. #3823

Open uraurora opened 2 days ago

uraurora commented 2 days ago

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

  1. An upstream HTTP service returns a streaming response (SSE); my local endpoint mainly relays that data and reports its token consumption.
  2. Locally, I use Python FastAPI and a Python asynchronous generator to yield each event.
  3. I created a span inside the asynchronous generator and applied the ai_track decorator to the function. I used with sentry_sdk.start_span(op="ai.chat_completions.create.xxx", name="xxx") as span, and I'm not sure if the op value is set correctly.

Expected Result

I expect LLM Monitoring to work for this, but it seems only the non-streaming API shows up.

Actual Result

The streaming API does not show anything. I'm not sure whether there's an issue with my configuration or whether this method of invocation is not currently supported.

Product Area

Insights

Link

https://moflow.sentry.io/insights/ai/llm-monitoring/?project=4508239351447552&statsPeriod=24h

DSN

No response

Version

2.19.0

getsantry[bot] commented 2 days ago

Assigning to @getsentry/support for routing ⏲️

szokeasaurusrex commented 1 day ago

Hi @uraurora, thank you for opening this issue.

I am having trouble understanding what you are trying to do, and what the problem is. Could you please provide specific steps on how to reproduce the problem? If possible, please provide a code snippet that we can run, so that we can see what you are trying to do.

uraurora commented 22 hours ago

Hi, in simple terms: I use FastAPI as the backend, and I want to record token consumption for an LLM endpoint that returns a streaming response. However, after calling the endpoint, nothing related shows up in the Sentry dashboard (Insights > AI > LLM Monitoring). The code is as follows:

@ai_track("sentry-ai-track-test-pipeline")
async def stream():
    # assume this is a llm stream call
    with sentry_sdk.start_span(op="ai.chat_completions.create.xxx", name="sentry-ai-track-test") as span:
        token = 0
        for i in range(10):
            token += 1
            yield f"{i}"

        record_token_usage(span, total_tokens=token)

@router.post(
    "/xxx/xxx",
    response_class=EventSourceResponse,
    status_code=status.HTTP_200_OK,
)
async def sse_api(
) -> EventSourceResponse:
    return EventSourceResponse(stream())

[screenshot of the LLM Monitoring page attached in the original comment]
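For contrast with the snippet above, a non-streaming variant of the same pipeline (the case that reportedly does show up in LLM Monitoring) would look roughly like this. It is a minimal sketch reusing the placeholder op and names from the code above, not code from the original report:

@ai_track("sentry-ai-track-test-pipeline")
async def non_stream() -> str:
    # Same span as above, but the whole body runs when the coroutine is awaited,
    # so record_token_usage() is called before the span and transaction finish.
    with sentry_sdk.start_span(op="ai.chat_completions.create.xxx", name="sentry-ai-track-test") as span:
        chunks = [str(i) for i in range(10)]
        record_token_usage(span, total_tokens=len(chunks))
        return "".join(chunks)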

antonpirker commented 15 hours ago

Can you link us to a transaction in the "Performance" tab on Sentry.io that contains the ai.chat_completions.create.* spans you are creating?

In general, the spans you create must contain the data described here: https://develop.sentry.dev/sdk/telemetry/traces/modules/llm-monitoring/

If we have a link to a transaction, we can see if the spans in this transaction have the correct format.
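For illustration, a rough sketch of a span carrying that kind of data. The attribute keys ("ai.model_id", "ai.input_messages", "ai.streaming") are taken from the linked page and should be verified against it; the model name, messages, and token counts are placeholders:

import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track, record_token_usage


@ai_track("example-pipeline")
def run_chat_completion():
    # Must run inside an active transaction (e.g. a traced request) so the
    # spans end up in the Performance / LLM Monitoring views.
    with sentry_sdk.start_span(
        op="ai.chat_completions.create.openai", name="example chat completion"
    ) as span:
        span.set_data("ai.model_id", "gpt-4o")  # placeholder model name
        span.set_data("ai.input_messages", [{"role": "user", "content": "Hi"}])
        span.set_data("ai.streaming", False)
        # Token counts would normally come from the LLM provider's response.
        record_token_usage(span, prompt_tokens=10, completion_tokens=20, total_tokens=30)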