Arize-ai / phoenix

AI Observability & Evaluation
https://docs.arize.com/phoenix

[ENHANCEMENT] Fault tolerant data ingestion with tokens #4294

Open mikeldking opened 3 months ago

mikeldking commented 3 months ago

If you log a token count with a type other than a number, the span ends up getting dropped:

Traceback (most recent call last):
  File "/phoenix/env/phoenix/db/bulk_inserter.py", line 200, in _insert_spans
    result = await insert_span(session, span, project_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/phoenix/env/phoenix/db/insertion/span.py", line 92, in insert_span
    cumulative_llm_token_count_completion += cast(int, accumulation[2] or 0)
TypeError: 'int' object is not iterable

This is confusing because the dropped span never shows up in the UI and nothing tells the user why. We should be more tolerant of malformed token counts and log clearer error messages.
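
A minimal sketch of what more tolerant handling could look like. It assumes a hypothetical helper, `coerce_token_count`, which is not part of Phoenix; the attribute name passed in is only used for the log message. The idea is to normalize each token count before it is accumulated, so that a bad value is logged and skipped instead of raising and dropping the whole span:

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def coerce_token_count(value: Any, attribute_name: str) -> Optional[int]:
    """Hypothetical helper: return a non-negative int, or None if unusable."""
    if value is None:
        return None  # a missing count is not an error

    count: Optional[int] = None
    if isinstance(value, bool):
        pass  # bool is a subclass of int, but True/False is not a count
    elif isinstance(value, int):
        count = value
    elif isinstance(value, float) and value.is_integer():
        count = int(value)  # accepts 12.0; rejects 12.5, NaN, and infinity
    elif isinstance(value, str):
        try:
            count = int(value.strip())  # accepts numeric strings such as "12"
        except ValueError:
            pass

    if count is None or count < 0:
        # Say which attribute was bad and what it held, then keep the span.
        logger.warning(
            "Ignoring invalid token count for %s: %r (type %s)",
            attribute_name,
            value,
            type(value).__name__,
        )
        return None
    return count
```

The caller could then accumulate `coerce_token_count(raw, name) or 0`, so a span with an unusable token count is still inserted, just without that count. Whether numeric strings should be accepted or rejected is a design choice; this sketch accepts them.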

RogerHYang commented 3 months ago

We also need to consider replicas: since the insertion failure is only known to one replica, users on other replicas will also need to be notified.