truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License
2.07k stars 181 forks

Getting issues in using Trulens with Langchain Agents like ReAct/Structured Chat #1133

Closed deepak-nigam-pepcus closed 1 month ago

deepak-nigam-pepcus commented 4 months ago

I am facing issues in evaluating bot responses through TruLens when using LangChain agents (like ReAct/Structured Chat).

Previously I was using a RAG chain, which sends simple prompts to the LLM, and I was able to run TruLens with it. However, with the RAG chain the groundedness score (an indicator of hallucination) was only 18%.

To reduce hallucinations I moved to the ReAct approach. It did reduce them, as verified by manual observation, but it made the integration with TruLens difficult. I am getting the following error:

======================================================================================

> Entering new AgentExecutor chain...
Could not find usage information in openai response: <openai.Stream object at 0x000001D7CAE6E7A0>
(the warning above is repeated 12 times in total)
Error calling wrapped function plan.
Traceback (most recent call last):
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\trulens_eval\instruments.py", line 503, in tru_wrapper
    rets, cost = mod_endpoint.Endpoint.track_all_costs_tally(
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\trulens_eval\feedback\provider\endpoint\base.py", line 494, in track_all_costs_tally
    result, cbs = Endpoint.track_all_costs(
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\trulens_eval\feedback\provider\endpoint\base.py", line 475, in track_all_costs
    return Endpoint._track_costs(
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\trulens_eval\feedback\provider\endpoint\base.py", line 572, in _track_costs
    result: T = __func(*args, **kwargs)
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain\agents\agent.py", line 397, in plan
    for chunk in self.runnable.stream(inputs, config={"callbacks": callbacks}):
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 2875, in stream
    yield from self.transform(iter([input]), config, **kwargs)
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 2862, in transform
    yield from self._transform_stream_with_config(
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 1880, in _transform_stream_with_config
    chunk: Output = context.run(next, iterator)  # type: ignore
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 2826, in _transform
    for output in final_pipeline:
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 1283, in transform
    for chunk in input:
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 4722, in transform
    yield from self.bound.transform(
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\runnables\base.py", line 1300, in transform
    yield from self.stream(final, config, **kwargs)
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\language_models\chat_models.py", line 245, in stream
    raise e
  File "c:\Users\admin\learning\AI\rag\chatbot-api\venv\lib\site-packages\langchain_core\language_models\chat_models.py", line 237, in stream
    assert generation is not None
AssertionError

I tried various approaches and searched the internet for suggestions, but every example I can find uses a RAG chain only. I am trying to resolve this ASAP, but wanted to keep you updated in the meantime.

(Screenshots attached: trulens_error2, TruLensError1)

dosubot[bot] commented 4 months ago

The error messages and issues you're encountering with Trulens and Langchain agents like ReAct/Structured Chat seem to stem from discrepancies in handling the OpenAI response, particularly regarding the expected usage data. Here's a focused approach to troubleshoot and potentially resolve these issues:

  1. Check OpenAI Response Structure: Ensure the OpenAI response includes a usage field. If the response structure has changed, you may need to adjust how you access this data in your code.

  2. Response Format Changes: If OpenAI's response format has evolved (e.g., from dicts to structured data), make sure your code correctly handles these changes to maintain compatibility with Langchain.

  3. Instrumentation of OpenAI Methods: Verify that your code is correctly instrumenting OpenAI methods for usage and cost tracking. Any changes in OpenAI's API could affect this process.

  4. Update Response Handling: Review and possibly update how your code processes OpenAI responses, especially the extraction of usage data, to align with any new response formats.

  5. Enhance Logging: Increase logging detail to capture more information about the OpenAI responses. This can help identify where the discrepancies might be occurring.

  6. Review Langchain Callbacks and Schema: Ensure that Langchain's callback handlers and schema are up-to-date with any changes in how LLM results are processed or expected.

  7. Test with Known Good Responses: Testing with previously working responses can help determine if the issue lies with the response processing logic.

By focusing on these areas, you should be better positioned to diagnose and fix the integration issues between Trulens and Langchain agents. If these steps do not resolve the problem, providing more detailed error messages, stack traces, and code snippets could help further diagnose the issue.
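To make points 1 and 4 concrete: the cost tracker looks for a `usage` field on the object an instrumented call returns. A non-streaming completion carries one, but a streaming call returns an iterator of chunks, so the lookup fails, matching the `Could not find usage information in openai response: <openai.Stream ...>` warnings in the log. Below is a minimal, self-contained illustration of that mechanism (not TruLens source code; the names are invented for the sketch):

```python
# Self-contained illustration of why a streamed response defeats usage
# extraction. FakeCompletion/fake_stream stand in for the OpenAI SDK's
# response object and openai.Stream; extract_usage loosely mimics what a
# cost tracker does.

class FakeCompletion:
    """Stands in for a non-streaming chat completion response."""
    def __init__(self):
        self.usage = {"prompt_tokens": 12, "completion_tokens": 5}

def fake_stream():
    """Stands in for openai.Stream: yields chunks, carries no usage field."""
    yield {"delta": "Hel"}
    yield {"delta": "lo"}

def extract_usage(response):
    """Return the usage field if present, else warn and return None."""
    usage = getattr(response, "usage", None)
    if usage is None:
        print(f"Could not find usage information in response: {response!r}")
    return usage

assert extract_usage(FakeCompletion()) is not None  # usage found
assert extract_usage(fake_stream()) is None         # generator has no usage
```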

#### Details

**Sources**

- [trulens_eval/trulens_eval/feedback/provider/endpoint/openai.py](https://github.com/truera/trulens/blob/main/trulens_eval/trulens_eval/feedback/provider/endpoint/openai.py)
- [trulens_eval/examples/expositional/frameworks/langchain/langchain_agents.ipynb](https://github.com/truera/trulens/blob/main/trulens_eval/examples/expositional/frameworks/langchain/langchain_agents.ipynb)
- [trulens_eval/trulens_eval/feedback/provider/endpoint/litellm.py](https://github.com/truera/trulens/blob/main/trulens_eval/trulens_eval/feedback/provider/endpoint/litellm.py)
- [trulens_eval/trulens_eval/tru_chain.py](https://github.com/truera/trulens/blob/main/trulens_eval/trulens_eval/tru_chain.py)
- [trulens_eval/trulens_eval/utils/imports.py](https://github.com/truera/trulens/blob/main/trulens_eval/trulens_eval/utils/imports.py)
- [docs/trulens_eval/tracking/instrumentation/langchain.ipynb](https://github.com/truera/trulens/blob/main/docs/trulens_eval/tracking/instrumentation/langchain.ipynb)
**About Dosu** This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a šŸ‘ if this is helpful and šŸ‘Ž if it is irrelevant.

To continue the conversation, mention @dosu.

joshreini1 commented 4 months ago

@deepak-nigam-pepcus can you share some example code for your agent setup that we can use to reproduce?

piotrm0 commented 4 months ago

Hi @deepak-nigam-pepcus. I replicated a bug that looks similar. I have a temporary fix in this PR: https://github.com/truera/trulens/pull/1138. Note, however, that cost tracking (in terms of token counts and USD total) will not work while we look for a more permanent solution to the underlying problem.

piotrm0 commented 4 months ago

> _[dosubot's response above, quoted in full]_

Good response for trulens devs but not for the end user who raised the issue.

dosubot[bot] commented 4 months ago

Given the complexity of the issue and the temporary fix provided in the pull request, it's important to integrate this fix into your environment to see if it resolves the error you're encountering with Trulens and Langchain agents. Here's a step-by-step guide to help you apply the temporary solution and further diagnose the problem if needed:

  1. Apply the Temporary Fix: Follow the instructions or changes outlined in the pull request. This might involve updating certain files or dependencies in your project to include the temporary fix.

  2. Re-test Your Setup: After applying the fix, re-run your agent setup to check if the error persists. This will help determine if the temporary fix addresses the issue with handling the OpenAI response and usage data.

  3. Monitor for Updates: Keep an eye on the pull request and any related discussions for updates on a permanent solution to the problem. Updates might include additional fixes or instructions on how to modify your code.

  4. Provide Feedback: If the temporary fix resolves your issue, consider providing feedback in the discussion or pull request. If you continue to experience problems, sharing detailed error messages, stack traces, and the specific code that's causing the issue can be invaluable for diagnosing and resolving the problem.

  5. Check for Official Updates: Regularly check for official updates or patches that address the integration issue permanently. Applying these updates promptly will ensure your setup remains functional and efficient.

By following these steps, you should be able to mitigate the issue temporarily and contribute to finding a permanent solution.


To continue the conversation, mention @dosu.

Ritesh-2001 commented 4 months ago

@deepak-nigam-pepcus Have you found a solution for this? I too am facing issues using TruLens with a ReAct agent. Could you share the syntax for defining metrics like context_relevance?