DataDog / dd-trace-py

Datadog Python APM Client
https://ddtrace.readthedocs.io/

LLMObs: LLMObs.annotate forces dictionary inputs to ASCII, making them unsearchable on datadog.com #9949

Open kimwonj77 opened 1 month ago

kimwonj77 commented 1 month ago

Summary of problem

from ddtrace import patch_all as patch
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import agent
from openai import OpenAI
from dotenv import load_dotenv
from typing import Dict, List

# Load environment variables (such as API keys) from a local .env file.
load_dotenv()
LLMObs.enable(
    ml_app="test",
    integrations_enabled=True,
    agentless_enabled=True,
)
patch()

oai = OpenAI()

@agent()
def llm(messages: List[Dict[str, str]]) -> Dict[str, str]:
    result = oai.chat.completions.create(messages=messages, model="gpt-4o-mini")
    message = result.choices[0].message.to_dict()
    # Annotate the agent span with dictionary input and output data.
    LLMObs.annotate(
        input_data={
            "messages": messages,
            "model": "gpt-4o-mini",
        },
        output_data=message,
    )
    return message

if __name__ == "__main__":
    print(llm([
        {"role": "system", "content": "너는 ClaudeAI야."},  # "You are ClaudeAI."
        {"role": "user", "content": "너는 누구니?"},  # "Who are you?"
    ]))

In this example, when the dictionary contains non-ASCII characters, it seems that json.dumps() is called without ensure_ascii=False (the default is True). The text is therefore stored as escape sequences, which makes it impossible to search in LLM Observability.

One of the places where this happens is here: https://github.com/DataDog/dd-trace-py/blob/ffe2201787a252edbab704f9556800d7adfdd33e/ddtrace/llmobs/_llmobs.py#L529-L544
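For reference, the standard-library behavior described above can be reproduced without ddtrace at all. This is a minimal sketch using only Python's json module; the payload variable is made up for illustration and is not taken from the library code.

```python
import json

# Illustrative payload with Korean (non-ASCII) text.
payload = {"role": "user", "content": "너는 누구니?"}

# ensure_ascii defaults to True, so non-ASCII characters become escape sequences.
escaped = json.dumps(payload)
# With ensure_ascii=False the characters are written out as-is.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode back to the same object.
print(json.loads(escaped) == json.loads(readable))  # True

# A plain substring match only succeeds on the unescaped form.
print("너는" in escaped)   # False
print("너는" in readable)  # True
```

Whether the Datadog search backend matches on the raw stored string is an assumption here, but it is consistent with the behavior reported in this issue.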

[screenshot] [screenshot] (Ignore the error shown there. It was my mistake while writing the PoC code.)

The value is formatted correctly on the detail page, but it cannot be searched. [screenshot] (Sometimes the formatting does not work on the detail page either, possibly when the context is too long, and the data is shown in an unreadable form.)

As a workaround, we could serialize the dictionaries ourselves with json.dumps(..., ensure_ascii=False) before passing them to LLMObs.annotate, but that feels a little bit weird.
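A minimal sketch of that client-side workaround follows. The helper name to_readable_json is hypothetical, and the commented-out call assumes that LLMObs.annotate stores a str value unchanged instead of serializing it again, which has not been verified here.

```python
import json
from typing import Any


def to_readable_json(value: Any) -> str:
    """Hypothetical helper: serialize to JSON while keeping non-ASCII text as-is."""
    return json.dumps(value, ensure_ascii=False)


messages = [
    {"role": "system", "content": "너는 ClaudeAI야."},
    {"role": "user", "content": "너는 누구니?"},
]

# Pre-serialize the dictionary so the Korean text is not escaped.
input_text = to_readable_json({"messages": messages, "model": "gpt-4o-mini"})
print(input_text)

# Assumed usage inside the @agent-decorated function from the summary:
# LLMObs.annotate(input_data=input_text, output_data=to_readable_json(message))
```

The downside is that every call site has to remember to pre-serialize its inputs, which is why a fix inside the library would be preferable.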

Which version of dd-trace-py are you using?

ddtrace==2.9.3

Which version of pip are you using?

pip==24.1.2

Which libraries and their versions are you using?

openai==1.35.13

How can we reproduce your problem?

See the code in the summary.

What is the result that you get?

Something like {"messages": [{"role": "system", "content": "\ub108\ub294 ChatGPT\uc57c."}, {"role": "user", "content": "\ub108\ub294 \ub204\uad6c\ub2c8?"}], "model": "gpt-4o-mini"} in the annotation, which is not searchable (the query was "너는").

What is the result that you expected?

{"messages": [{"role": "system", "content": "너는 ChatGPT야."}, {"role": "user", "content": "너는 누구니?"}], "model": "gpt-4o-mini"} in the annotation, and searchable.
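The actual and expected results above encode the same data and differ only in how the text is escaped. The sketch below checks this with the standard json module by decoding the escaped annotation and re-serializing it with ensure_ascii=False. The variable names are illustrative and the two strings are copied from this report.

```python
import json

# Annotation as currently stored (escaped form), copied from the report.
received = r'{"messages": [{"role": "system", "content": "\ub108\ub294 ChatGPT\uc57c."}, {"role": "user", "content": "\ub108\ub294 \ub204\uad6c\ub2c8?"}], "model": "gpt-4o-mini"}'

# Annotation as expected (readable form), copied from the report.
expected = '{"messages": [{"role": "system", "content": "너는 ChatGPT야."}, {"role": "user", "content": "너는 누구니?"}], "model": "gpt-4o-mini"}'

# Decoding turns the escape sequences back into the original characters.
decoded = json.loads(received)

# Re-serializing without ASCII escaping yields the expected, searchable text.
normalized = json.dumps(decoded, ensure_ascii=False)

print(normalized == expected)  # True
print("너는" in normalized)    # True
```

So no information is lost in the stored annotation. Only the serialized representation would need to change for the search query to match.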

emmettbutler commented 1 month ago

Thanks for reporting this, @kimwonj77. We'll take a look.

cc @Yun-Kim