run-llama / llama_docs_bot

Bottoms Up Development with LlamaIndex - Building a Documentation Chatbot

3_eval_baseline: ValueError: Expected tool_calls in ai_message.additional_kwargs, but none found. #5

Open jonmach opened 11 months ago

jonmach commented 11 months ago

Running 3_eval_baseline.ipynb against a local OpenAI-compatible server. The server responds to prompts fine, and there was no issue in the previous two labs.

import openai

from llama_index import ServiceContext, set_global_service_context
from llama_index.llms import OpenAILike, LOCALAI_DEFAULTS
from langchain.embeddings import HuggingFaceInstructEmbeddings
from llama_index.embeddings import LangchainEmbedding

# print(LOCALAI_DEFAULTS)
MYLOCALAIDEFAULTS = {
    'api_key': 'localai_fake',
    'api_type': 'localai_fake',
    'api_base': 'http://localhost:1234/v1',
}

# Point the openai client at the local server as well
openai.base_url = 'http://localhost:1234/v1'
openai.api_key = '...'

# llm = ChatOpenAI(openai_api_key='...', model="text-davinci-003", max_tokens=1000, temperature=0.7)
llm = OpenAILike(
    **MYLOCALAIDEFAULTS,
    model="mymodel",
    max_tokens=10000,
    temperature=0.0,
)

# Create a global service context with a local embedding model
# service_context = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo", temperature=0))
embed_model = LangchainEmbedding(
    HuggingFaceInstructEmbeddings(model_name='hkunlp/instructor-large')
)

service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
oai_service_context = service_context
set_global_service_context(service_context)

I built the indexes with no problem, and then created the Query Engine Tools as well as the Unified Query Engine.

However, when I tried to test the Query Engine, there seemed to be a problem within LlamaIndex that, as far as I can tell, effectively requires OpenAI as the underlying LLM.

When running

response = query_engine.query("How do I install llama index?")
print(str(response))

I get the following errors:


ValueError                                Traceback (most recent call last)
/Users/jon/dev/LLM/LLaMaIndex/llama_docs_bot/3_eval_baseline/3_eval_basline.ipynb Cell 16 line 1
----> 1 response = query_engine.query("How do I install llama index?")
      2 print(str(response))

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_index/core/base_query_engine.py:30, in BaseQueryEngine.query(self, str_or_query_bundle)
     28 if isinstance(str_or_query_bundle, str):
     29     str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 30 return self._query(str_or_query_bundle)

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_index/query_engine/sub_question_query_engine.py:132, in SubQuestionQueryEngine._query(self, query_bundle)
    128 def _query(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
    129     with self.callback_manager.event(
    130         CBEventType.QUERY, payload={EventPayload.QUERY_STR: query_bundle.query_str}
    131     ) as query_event:
--> 132         sub_questions = self._question_gen.generate(self._metadatas, query_bundle)
    134         colors = get_color_mapping([str(i) for i in range(len(sub_questions))])
    136         if self._verbose:

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_index/question_gen/openai_generator.py:88, in OpenAIQuestionGenerator.generate(self, tools, query)
     85 tools_str = build_tools_text(tools)
     86 query_str = query.query_str
     87 question_list = cast(
---> 88     SubQuestionList, self._program(query_str=query_str, tools_str=tools_str)
     89 )
     90 return question_list.items

File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/llama_index/program/openai_program.py:174, in OpenAIPydanticProgram.__call__(self, *args, **kwargs)
    172 message = chat_response.message
    173 if "tool_calls" not in message.additional_kwargs:
--> 174     raise ValueError(
    175         "Expected tool_calls in ai_message.additional_kwargs, "
    176         "but none found."
    177     )
    179 tool_calls = message.additional_kwargs["tool_calls"]
    180 return _parse_tool_calls(
    181     tool_calls,
    182     output_cls=self.output_cls,
    183     allow_multiple=self._allow_multiple,
    184     verbose=self._verbose,
    185 )

ValueError: Expected tool_calls in ai_message.additional_kwargs, but none found.


I've tried this with Python 3.12 and 3.11.

I know that the LLM is returning text, but this seems to be failing within LlamaIndex itself.

[2023-12-07 15:49:38.873] [INFO] Generated prediction: {
  "id": "chatcmpl-w0vla2d5wefn40gg8bphe",
  "object": "chat.completion",
  "created": 1701964169,
  "model": "/Users/jon/.cache/lm-studio/models/TheBloke/Chronos-Hermes-13b-v2-GGUF/chronos-hermes-13b-v2.Q6_K.gguf",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": " 1. Getting Started: How can I install llama index on my machine?\n2. Getting Started: What are the system requirements for installing llama index?\n3. Getting Started: Can you provide step-by-step instructions for installing llama index?\n4. Getting Started: Is there a video tutorial available for installing llama index?\n5. Getting Started: How do I troubleshoot common installation issues with llama index?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 98,
    "total_tokens": 98
  }
}
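
So the server packs the generated sub-questions into message.content as plain text, while the failing check wants an OpenAI-style tool call. A quick way to confirm which question generator is in play (the _question_gen attribute name is taken from the traceback above, so treat this as a sketch):

# Sketch: SubQuestionQueryEngine defaults to OpenAIQuestionGenerator,
# which only accepts answers delivered as OpenAI-style tool calls.
print(type(query_engine._question_gen))
# prints <class 'llama_index.question_gen.openai_generator.OpenAIQuestionGenerator'>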

Any thoughts?
ghbacct commented 10 months ago

Any ideas why this issue might be occurring? I've seen it happen with an Azure OpenAI model as well.

In my case this error seems to be caused by an intermittent failure of the model to use tools / function calls for structured output extraction using a Pydantic model.
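
For the intermittent case, a blunt workaround is to retry the call. A minimal sketch, matching on the exception text from the traceback above (nothing more principled is visible from the outside):

# Purely illustrative: re-issue the query when the model fails to emit
# tool_calls; re-raise anything else immediately.
def query_with_retry(engine, question, attempts=3):
    last_exc = None
    for _ in range(attempts):
        try:
            return engine.query(question)
        except ValueError as exc:
            if "tool_calls" not in str(exc):
                raise
            last_exc = exc
    raise last_exc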

jonmach commented 10 months ago

The problem seems to be due to LlamaIndex assuming OpenAI as the default LLM. I now pass the service_context to everything (e.g. query engine creation, index creation), and that has resolved the issue.
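
For reference, a sketch of what "pass the service_context to everything" looks like with the pre-0.10 ServiceContext API; documents here is a placeholder for whatever you loaded:

from llama_index import VectorStoreIndex

# Pass the local-LLM service_context explicitly at every step so that no
# component silently falls back to the OpenAI defaults.
index = VectorStoreIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(service_context=service_context)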

Aikoin commented 9 months ago

The problem seems to be due to LlamaIndex assuming OpenAI as the default LLM. I now pass the service_context to everything (e.g. query engine creation, index creation), and that has resolved the issue.

I met the same issue and solved it by adding the following line: question_gen = LLMQuestionGenerator.from_defaults(service_context=service_context). It's easy to miss question_gen!

Aikoin commented 9 months ago

Oops, this issue happened to me again :((( The following is my code:

simple_tool = QueryEngineTool.from_defaults(
    query_engine=simple_query_engine,
    description="Useful when the query is relatively straightforward and can be answered with direct information retrieval, without the need for complex transformations.",
)

multi_step_tool = QueryEngineTool.from_defaults(
    query_engine=multi_step_query_engine,
    description="Useful when complex or multifaceted information needs are present, and a single query isn't sufficient to fully understand or retrieve the necessary information. This approach is especially beneficial in environments where the context evolves with each interaction or where the information is layered and requires iterative exploration.",
)

sub_question_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine_sub_question,
    description="Useful when complex questions can be effectively broken down into simpler sub-questions, each of which can be answered independently. For example if you have to compare two ore more things.",
)

These query engines all worked fine. Below is the code that reported the error:

summarizer = TreeSummarize(
    service_context=service_context_openchat,
)
query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(llm=openchat),
    query_engine_tools=[
        simple_tool,
        multi_step_tool,
        sub_question_tool,
    ],
    service_context=service_context_openchat,
    summarizer=summarizer,
)
response_1 = query_engine.query("What is Nicolas Cage's profession?")
print(str(response_1))

I can't find where the error is.

rookie-littleblack commented 9 months ago

Try this:

from llama_index.question_gen.llm_generators import LLMQuestionGenerator
question_gen = LLMQuestionGenerator.from_defaults(service_context=service_context)

query_engine = SubQuestionQueryEngine.from_defaults(
    question_gen=question_gen,  # <<< add this line, otherwise it will use OpenAIQuestionGenerator!
    query_engine_tools=query_engine_tools,
    use_async=True,
)
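
(For context: LLMQuestionGenerator prompts the LLM for plain text and parses it with an output parser, whereas OpenAIQuestionGenerator depends on OpenAI-style function calling, which most local models and servers don't emit.)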
Aikoin commented 9 months ago

Try this:

from llama_index.question_gen.llm_generators import LLMQuestionGenerator
question_gen = LLMQuestionGenerator.from_defaults(service_context=service_context)

query_engine = SubQuestionQueryEngine.from_defaults(
    question_gen=question_gen,  # <<< add this line, otherwise it will use OpenAIQuestionGenerator!
    query_engine_tools=query_engine_tools,
    use_async=True,
)

Many thanks! I've already found where the problem lay: "Pydantic selectors (currently only supported by gpt-4-0613 and gpt-3.5-turbo-0613 (the default))". Since my model is openchat, I changed the selector to an LLM selector and it worked! :))))
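
For anyone landing here later, a sketch of that selector swap against my snippet above (legacy, pre-0.10 import path; check your version):

from llama_index.selectors.llm_selectors import LLMSingleSelector

# LLMSingleSelector prompts for text and parses it, so it does not need a
# function-calling model the way PydanticSingleSelector does.
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(service_context=service_context_openchat),
    query_engine_tools=[simple_tool, multi_step_tool, sub_question_tool],
    service_context=service_context_openchat,
    summarizer=summarizer,
)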