truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License

[BUG] Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements. && UserWarning: No supporting evidence provided. Returning score only. warnings.warn( #1505

Open zhongshuai-cao opened 5 days ago

zhongshuai-cao commented 5 days ago

Bug Description
When I run the evaluation, it gives me both warning and error messages.

Warning: UserWarning: No supporting evidence provided. Returning score only.
Error: Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.

To Reproduce
I was following the example to set up a llama-index query engine using the OpenAI API and Azure AI Search. A sketch of the feedback setup is below, and the selector log lines it produces follow.
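Roughly, the feedback functions are wired up like this (a minimal sketch matching the selectors in my logs, assuming the trulens 1.x namespace packages and a llama-index `query_engine` built separately; the model name is a placeholder, not my exact configuration):

```python
from trulens.core import Feedback, TruSession
from trulens.apps.llamaindex import TruLlama
from trulens.providers.openai import OpenAI as OpenAIProvider

session = TruSession()  # writes to the sqlite db url shown in the logs

# Placeholder model choice for the feedback provider.
provider = OpenAIProvider(model_engine="gpt-4o-mini")

# Selector for retrieved source-node text, matching the
# record.app.query.rets.source_nodes[:].node.text selectors in the logs.
context = TruLlama.select_context(query_engine)

f_groundedness = (
    Feedback(provider.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(context.collect())  # source = all retrieved chunks, collected
    .on_output()            # statement = the app's final answer
)
f_answer_relevance = (
    Feedback(provider.relevance, name="Answer Relevance")
    .on_input()
    .on_output()
)
f_context_relevance = (
    Feedback(provider.context_relevance, name="Context Relevance")
    .on_input()
    .on(context)
)
```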

Relevant Logs/Tracebacks

from langchain_community.callbacks.openai_info import OpenAICallbackHandler
You can use the langchain cli to automatically upgrade many imports. Please see documentation here https://python.langchain.com/v0.2/docs/versions/v0_2/
from langchain.callbacks.openai_info import OpenAICallbackHandler

🦑 TruSession initialized with db url sqlite:////Users/xxx/xxx/trulens.db .
🛑 Secret keys may be written to the database. See the database_redact_keys option of TruSession to prevent this.
✅ In Groundedness, input source will be set to record.app.query.rets.source_nodes[:].node.text.collect() .
✅ In Groundedness, input statement will be set to record.main_output or Select.RecordOutput .
✅ In Answer Relevance, input prompt will be set to record.main_input or Select.RecordInput .
✅ In Answer Relevance, input response will be set to record.main_output or Select.RecordOutput .
✅ In Context Relevance, input question will be set to record.main_input or Select.RecordInput .
✅ In Context Relevance, input context will be set to record.app.query.rets.source_nodes[:].node.text .
Processing questions:   1% | 1/128 [00:09<20:41,  9.77s/it]
/opt/homebrew/Caskroom/miniforge/base/envs/env_name/lib/python3.11/site-packages/trulens/feedback/llm_provider.py:289: UserWarning: No supporting evidence provided. Returning score only.
  warnings.warn(
Processing questions:   2% | 2/128 [00:24<26:20, 12.54s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:   2% | 3/128 [00:28<18:00,  8.64s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:   3% | 4/128 [00:40<20:32,  9.94s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:   4% | 5/128 [00:51<21:44, 10.61s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:  10% | 13/128 [02:20<21:02, 10.98s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:  11% | 14/128 [02:32<21:55, 11.54s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
Processing questions:  12% | 15/128 [02:41<19:52, 10.55s/it]
Error removing trivial statements: invalid syntax (<string>, line 1). Proceeding with all statements.
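The "Processing questions" progress bar comes from a loop over my question set, roughly like this (continuing the sketch above; `questions` stands in for my 128-question list and the app name is a placeholder):

```python
from tqdm import tqdm

# Wrap the query engine so each call is recorded and evaluated
# with the three feedback functions defined above.
tru_app = TruLlama(
    query_engine,
    app_name="llamaindex_azure_search",  # placeholder name
    feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],
)

with tru_app as recording:
    for question in tqdm(questions, desc="Processing questions"):
        query_engine.query(question)
```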

Environment:

sfc-gh-jreini commented 6 hours ago

Hey @zhongshuai-cao - what LLM are you using for your feedback provider? The failures here are a result of the LLM provider failing to follow instructions.
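If the provider is pointed at a smaller model, a quick check is to swap in a stronger one, which tends to follow the feedback prompts' output format more reliably. For example (a sketch; the model name is just an example, not a required setting):

```python
from trulens.providers.openai import OpenAI

# Example only: a GPT-4-class model is more likely to return output
# the trivial-statement removal step can parse.
provider = OpenAI(model_engine="gpt-4o")
```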

Also, can you share the text you're evaluating that leads to this error? This will help us debug.