Agenta-AI / agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM Observability all in one place.
http://www.agenta.ai
MIT License
1.28k stars, 188 forks

[Bug] 500 Internal server error on otlp traces #2280

Open ar-or opened 1 hour ago

ar-or commented 1 hour ago

Describe the bug

When a Python script initialized with ag.init() tries to send traces, the agenta backend responds with a 500 Internal Server Error.

To Reproduce

  1. Create a Python script that calls ag.init()
  2. Start the agenta backend
  3. Start your script
  4. Observe the error in the backend logs:

agenta-backend-1           | INFO:agenta_backend.utils.project_utils:Retrieving project_id from request...
agenta-backend-1           | INFO:agenta_backend.utils.project_utils:No project ID found in the request
agenta-backend-1           | INFO:agenta_backend.services.auth_helper:Retrieving default project from database...
agenta-backend-1           | INFO:agenta_backend.services.auth_helper:Default project fetched: 019344f8-3568-73c9-a61e-1997d4e6d501 and set in request.state
agenta-backend-1           | Traceback (most recent call last):
agenta-backend-1           |   File "/app/agenta_backend/core/observability/utils.py", line 152, in parse_ingest_value
agenta-backend-1           |     attributes[key] = to_type(attributes[key])
agenta-backend-1           |   File "/usr/local/lib/python3.9/uuid.py", line 177, in __init__
agenta-backend-1           |     raise ValueError('badly formed hexadecimal UUID string')
agenta-backend-1           | ValueError: badly formed hexadecimal UUID string
agenta-backend-1           | Traceback (most recent call last):
agenta-backend-1           |   File "/app/agenta_backend/apis/fastapi/shared/utils.py", line 12, in wrapper
agenta-backend-1           |     return await func(*args, **kwargs)
agenta-backend-1           |   File "/app/agenta_backend/apis/fastapi/observability/router.py", line 135, in otlp_receiver
agenta-backend-1           |     await self.service.ingest(
agenta-backend-1           |   File "/app/agenta_backend/core/observability/service.py", line 58, in ingest
agenta-backend-1           |     parse_ingest(span_dtos)
agenta-backend-1           |   File "/app/agenta_backend/core/observability/utils.py", line 170, in parse_ingest
agenta-backend-1           |     for key in attributes.keys():
agenta-backend-1           | RuntimeError: dictionary changed size during iteration
agenta-backend-1           | ERROR : support_id=e36bd9b4-3267-4973-bfc7-cdaf6d4b9f5d & operation_id=otlp_receiver
agenta-backend-1           | INFO:     172.20.0.11:55722 - "POST /api/observability/v1/otlp/traces HTTP/1.1" 500 Internal Server Error

Expected behavior

Expected no error.

Network logs captured with tcpdump between containers:

12:24:52.476780 IP (tos 0x0, ttl 64, id 56393, offset 0, flags [DF], proto TCP (6), length 311)
    172.20.0.11.55722 > 172.20.0.9.8000: Flags [P.], cksum 0x5966 (incorrect -> 0x0b91), seq 1385953087:1385953346, ack 2373849875, win 502, options [nop,nop,TS val 4045234902 ecr 841428240], length 259

E..7.I@.@..;.......     ...@R..?.~......Yf.....
..b.2'-.

POST /api/observability/v1/otlp/traces HTTP/1.1
Host: agenta-backend:8000
User-Agent: OTel-OTLP-Exporter-Python/1.28.2
Accept-Encoding: gzip, deflate, zstd
Accept: */*
Connection: keep-alive
Content-Type: application/x-protobuf
Content-Length: 1243

Additional Information

The Python app is a Langflow app instrumented as follows:

    ag.init()
    OllamaInstrumentor().instrument()
    OpenAIInstrumentor().instrument()
    QdrantInstrumentor().instrument()
    LangchainInstrumentor().instrument()

dosubot[bot] commented 1 hour ago

Hello @ar-or 👋 I'm here to help you with any bugs, questions, or contribution inquiries you may have.

The error you're encountering appears to stem from two problems visible in the agenta backend traceback:

  1. UUID parsing error: ValueError: badly formed hexadecimal UUID string is raised in parse_ingest_value (core/observability/utils.py, line 152) when to_type(attributes[key]) tries to build a UUID from a string that is not in a valid UUID format. This suggests the trace data contains an attribute that the backend expects to be a UUID but that is formatted differently. Verify that any UUID-valued attributes in your trace data are well formed before they are sent to the backend.

  2. Dictionary size change during iteration: RuntimeError: dictionary changed size during iteration is raised in parse_ingest (core/observability/utils.py, line 170) at the line for key in attributes.keys(). Python raises this error when keys are added to or removed from a dictionary while it is being iterated. The usual fix is to iterate over a copy of the keys, or to collect the changes and apply them after the loop finishes.
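
The second issue can be reproduced in isolation. The backend's actual parse_ingest code is not shown in this thread, so the sketch below is only a hypothetical reconstruction: it assumes that a key is deleted when UUID conversion fails, and the function names parse_unsafe and parse_safe are invented for illustration.

```python
import uuid


def parse_unsafe(attributes: dict) -> dict:
    """Hypothetical failure mode: deleting keys while iterating the live key view."""
    for key in attributes.keys():
        try:
            attributes[key] = uuid.UUID(attributes[key])
        except ValueError:
            # Removing a key changes the dict size, so the next loop step
            # raises RuntimeError: dictionary changed size during iteration.
            del attributes[key]
    return attributes


def parse_safe(attributes: dict) -> dict:
    """Same logic, but iterating over a snapshot of the keys."""
    for key in list(attributes.keys()):
        try:
            attributes[key] = uuid.UUID(attributes[key])
        except ValueError:
            # Safe: the loop walks the snapshot list, not the dict itself.
            del attributes[key]
    return attributes
```

Replacing the value of an existing key does not change the dictionary's size, which is why only the deletion triggers the error. Whether dropping the invalid attribute is the right behavior for the backend is a separate design decision that this sketch does not settle.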

Addressing both issues should help in resolving the 500 Internal Server Error you're experiencing with the OTLP traces.
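
For the UUID check, a minimal sketch of a tolerant conversion helper follows. The name to_uuid_or_none is hypothetical and is not part of agenta or its SDK; it only relies on the standard library's uuid.UUID, which raises ValueError for malformed input.

```python
import uuid
from typing import Optional


def to_uuid_or_none(value: object) -> Optional[uuid.UUID]:
    """Return a UUID if the value parses as one, otherwise None."""
    try:
        return uuid.UUID(str(value))
    except ValueError:
        # Malformed input, e.g. wrong length or non-hexadecimal characters.
        return None
```

A helper like this could be used either on the client, to check attribute values before export, or on the server, to skip or log a malformed value instead of failing the whole request. For example, to_uuid_or_none("019344f8-3568-73c9-a61e-1997d4e6d501") returns a UUID object, while a 16-character hexadecimal string returns None.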
