diicellman opened this issue 6 months ago
@diicellman I saw exactly the same regression in llama-index myself. I haven't had time to dive deep into this one, but I have personally pinned llama-index to 0.10.19 for the time being. We will investigate further and get back to you, but they soft-deprecated callbacks in 0.10.20, and my guess is that this is causing the problems.
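If you want the same workaround and are using Poetry (as the pyproject-style specs later in this thread suggest), pinning an exact version would look something like this:

```toml
[tool.poetry.dependencies]
# exact pin (no caret), so 0.10.20's soft-deprecated callbacks are avoided
llama-index = "0.10.19"
```

With a caret constraint like "^0.10.19", Poetry is free to resolve to 0.10.20 or later, which is exactly what the pin is meant to prevent.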
Thank you for the reply! I was also thinking that llama-index's callbacks updates could be the cause of the issue.
So far we've reproduced orphaned chunking spans when streams are not consumed. This happens because the chunking spans are emitted as soon as they are created, but the overall trace is never "closed".
While not the same as the issue above, it does give us some indication that traces are not shutting down as expected in some situations.
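The mechanism can be sketched with a stdlib-only analogy (this is not Phoenix's actual implementation): per-chunk spans are emitted eagerly as the stream produces chunks, but the enclosing trace only closes when the generator is exhausted or explicitly closed.

```python
# Illustrative sketch, not Phoenix's real tracing code.
closed = []

def streamed_response(chunks):
    try:
        for chunk in chunks:
            yield chunk  # per-chunk spans (e.g. "chunking") are emitted here
    finally:
        closed.append("trace")  # the trace closes only on exhaustion/close()

# Fully consuming the stream closes the trace:
list(streamed_response(["a", "b"]))
assert closed == ["trace"]

# Abandoning the stream after one chunk leaves the trace open until the
# generator is garbage-collected or explicitly closed, so the per-chunk
# spans look orphaned in the UI:
gen = streamed_response(["a", "b"])
next(gen)
assert closed == ["trace"]  # still only the first, fully consumed trace
gen.close()                 # consuming (or closing) the stream ends it
assert closed == ["trace", "trace"]
```

The practical takeaway: if you request a streaming response, iterate it to completion (or close it) so the enclosing trace can be finalized.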
After some investigation, we don't think the callbacks are working as they did previously. We tried running the LlamaDebugHandler notebook (https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/callbacks/LlamaDebugHandler.ipynb) and we no longer see any traces being printed.
We are working with the llama-index team to resolve this.
@diicellman I just hit this issue again with someone else and want to double-check one thing. It's imperative that instrument() is called BEFORE any llama-index initialization happens. I'm not sure this will solve anything for you, but I wanted to double-check, since this is after changing when instrumentation is called:
Before:
After:
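The ordering requirement can be illustrated with a stdlib-only analogy (this is not llama-index's real callback code): if a framework snapshots its registered handlers when an object is constructed, anything registered after construction is silently missed. That is why instrument() must run before any llama-index objects are created.

```python
# Hypothetical sketch of why call order matters; names are illustrative.
_global_handlers = []
traced = []

class QueryEngine:
    def __init__(self):
        # handlers are captured at construction time, roughly like a
        # callback manager snapshotting its registered handlers
        self.handlers = list(_global_handlers)

    def query(self, q):
        for handler in self.handlers:
            handler(q)
        return f"answer: {q}"

def instrument():
    # stand-in for a real instrumentor's instrument() call
    _global_handlers.append(lambda q: traced.append(q))

# Wrong order: the engine is built before instrument(), so nothing is traced.
early_engine = QueryEngine()
instrument()
early_engine.query("hello")
assert traced == []

# Right order: instrument() first, then build the engine.
late_engine = QueryEngine()
late_engine.query("hello")
assert traced == ["hello"]
```

Under this model, moving the instrument() call ahead of all llama-index imports and object construction is the safe default.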
Thank you for your help! I'm calling instrument() in the main.py FastAPI file on application startup, before any llama-index calls.
I just updated the libraries to the latest version, and I'm encountering the same chunking-only traces rather than full ones. I wondered whether the problem might be related to asynchronous usage: both the endpoints in my app and llama-index's query_engine.aquery() are asynchronous. However, when I made them non-asynchronous, I still got the same chunking traces.
Oh interesting. Is your code on GitHub, by chance? Would love to unblock you, @diicellman.
Yes, my code is on GitHub, but it's in a private repository. If you need to review the code, I can provide the crucial parts. I'm sorry for any inconvenience this may cause.
Moving this to our backlog for now. We've communicated the lack of a trace tree in some contexts to llama-index, and they are investigating.
We will probably fix this via the new instrumentation system rather than the callbacks.
Describe the bug: I have an instance of Arize Phoenix running in a Docker container. I've been using the instrument.py example for tracing, and previously there were no problems. Today I pulled the latest Docker container (3.16.0), and now tracing captures only "chunking" and nothing more. Previously it captured the full trace for query engine calls, etc.
Here are my specs:
- python = 3.10
- llama-index = "^0.10.19"
- openinference-semantic-conventions = "^0.1.5"
- openinference-instrumentation-llama-index = "^1.2.0"
- opentelemetry-exporter-otlp = "^1.23.0"
- llama-index-readers-telegram = "^0.1.4"
- llama-index-llms-anthropic = "^0.1.6"
- llama-index-callbacks-arize-phoenix = "^0.1.4"
- arize-phoenix = {extras = ["evals"], version = "^3.16.0"}
To Reproduce:
Expected behavior: To see full tracing.
Screenshots: This is what I get after running query_engine, only "chunking".