Closed · mdciri closed this 3 weeks ago
It would be a huge refactor to bubble up this information, FYI.

My recommendation is to use an observability tool, or to write your own event handler that looks at LLM events.

For example, you could access and record `response.raw` by extending this example slightly: https://docs.llamaindex.ai/en/stable/examples/instrumentation/observe_api_calls/
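A minimal sketch of such a handler, assuming the event types used in that example (`LLMCompletionEndEvent` / `LLMChatEndEvent`, whose `response` objects carry the provider's `raw` payload):

```python
from typing import Any

from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.core.instrumentation.events.llm import (
    LLMChatEndEvent,
    LLMCompletionEndEvent,
)


class RawResponseHandler(BaseEventHandler):
    """Collect the provider's raw payload from every LLM call."""

    @classmethod
    def class_name(cls) -> str:
        return "RawResponseHandler"

    def handle(self, event: Any) -> None:
        if isinstance(event, (LLMCompletionEndEvent, LLMChatEndEvent)):
            # response.raw holds whatever the provider returned, e.g. the
            # Bedrock guardrail trace and token counts discussed below.
            print(event.response.raw)


# Attach to the root dispatcher so every LLM call is observed.
get_dispatcher().add_event_handler(RawResponseHandler())
```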
Going to close this for now; using the instrumentation is the preferred way to access this information right now.
Feature Description
I would like to keep the `raw` information when calling `client.invoke_model(..., trace="ENABLED")`. I created a class `MyBedrock`, a subclass of `Bedrock`, whose `complete` method makes that call and keeps the full response payload in `raw`.
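Something along these lines (a sketch, assuming the boto3 `bedrock-runtime` client is exposed as `self._client` and an Anthropic-style request body; the exact body shape depends on the model):

```python
import json
from typing import Any

from llama_index.core.llms import CompletionResponse
from llama_index.llms.bedrock import Bedrock


class MyBedrock(Bedrock):
    """Bedrock LLM that keeps the full invoke_model payload in .raw."""

    def complete(
        self, prompt: str, formatted: bool = False, **kwargs: Any
    ) -> CompletionResponse:
        # Assumption: the boto3 bedrock-runtime client is reachable as
        # self._client; guardrailIdentifier/guardrailVersion would also be
        # passed here if a guardrail is attached to the call.
        response = self._client.invoke_model(
            modelId=self.model,
            body=json.dumps({"prompt": prompt, "max_tokens_to_sample": 512}),
            trace="ENABLED",  # ask Bedrock to include the guardrail trace
        )
        payload = json.loads(response["body"].read())
        # Guardrail decisions arrive in the response body; token counts
        # arrive in the HTTP headers, so keep both on the response.
        raw = {**payload, "ResponseMetadata": response["ResponseMetadata"]}
        return CompletionResponse(text=payload.get("completion", ""), raw=raw)
```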
When I run a completion with this class, the returned `res.raw` reports not only the number of input and output tokens, but also all the decisions and detections made by the AWS guardrail.

Unfortunately, when I create a RAG pipeline with a `RetrieverQueryEngine` and call its `query` method with the same query, the previous `res.raw` information is completely lost.
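For instance (a hypothetical minimal setup; the document, model id, and query are placeholders, reusing the `MyBedrock` subclass sketched above):

```python
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.query_engine import RetrieverQueryEngine

llm = MyBedrock(model="anthropic.claude-v2")  # hypothetical model id
index = VectorStoreIndex.from_documents([Document(text="some example text")])
query_engine = RetrieverQueryEngine.from_args(
    index.as_retriever(similarity_top_k=3), llm=llm
)

res = query_engine.query("the same query as before")
# res is a Response object: .response, .source_nodes, and .metadata are
# available, but the LLM-level CompletionResponse.raw is not propagated.
```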
Thanks in advance
Reason
It would be nice to keep track of this information, to better diagnose and debug why the guardrail blocked a request or what it masked, etc. It also makes it possible to track the cost of the model used in Bedrock.
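To illustrate the cost point, a sketch of turning a preserved `raw` payload into a cost estimate, assuming the `x-amzn-bedrock-*-token-count` response headers and placeholder prices:

```python
# Placeholder per-1K-token prices; real prices depend on model and region.
PRICE_IN_PER_1K = 0.008
PRICE_OUT_PER_1K = 0.024


def estimate_cost(raw: dict) -> float:
    """Rough cost estimate from the token-count headers kept in res.raw."""
    headers = raw["ResponseMetadata"]["HTTPHeaders"]
    tokens_in = int(headers["x-amzn-bedrock-input-token-count"])
    tokens_out = int(headers["x-amzn-bedrock-output-token-count"])
    return tokens_in / 1000 * PRICE_IN_PER_1K + tokens_out / 1000 * PRICE_OUT_PER_1K
```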
I have not tried any other provider yet, but I believe other providers return something similar in `res.raw`, so the requested feature could be extended to all the providers available in `llama-index`.

Value of Feature
Model evaluation will be more complete.