Closed lambda7xx closed 2 months ago
Why is there the weird LLM call?
Hey @lambda7xx! Great to see you diving deep into the mysteries of LlamaIndex again. Hope you're doing well!
Based on the information you've provided, it seems like the "weird" LLM calls you're seeing are part of the multi-step query process. The LlamaIndex framework uses a MultiStepQueryEngine
to handle complex queries that require multiple steps of reasoning or decomposition. This engine works by iteratively decomposing and processing each step of the query until a stopping condition is met.
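As a rough illustration (not LlamaIndex's actual code), that loop can be sketched in plain Python: transform the query against prior reasoning, answer the resulting sub-question, and stop once the transform yields nothing further. The `transform` and `answer` stand-ins below are toy functions invented for the example.

```python
# Hypothetical sketch of a multi-step query loop; not LlamaIndex source.
def multi_step_query(original_query, transform, answer, max_steps=3):
    reasoning = []  # (sub_question, sub_answer) pairs from earlier steps
    for _ in range(max_steps):
        sub_q = transform(original_query, reasoning)
        if sub_q is None:  # stopping condition: nothing left to ask
            break
        reasoning.append((sub_q, answer(sub_q)))
    return reasoning

# Toy stand-ins for the transform and the index lookup.
facts = {"Who won?": "Alice", "How many titles does Alice have?": "3"}

def transform(q, reasoning):
    if not reasoning:
        return "Who won?"
    if len(reasoning) == 1:
        return "How many titles does Alice have?"
    return None  # signal: the question is fully answered

steps = multi_step_query(
    "How many titles does the winner have?", transform, facts.get
)
```

Each "weird" LLM call you logged corresponds to one `transform` invocation in a loop like this, which is why its prompt looks so different from the retrieval-and-answer calls.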
The "weird" prompts you're seeing are likely generated by the query_transform
function, which is responsible for transforming an original query into a new query that can be answered with the given context and, if applicable, previous reasoning steps. This transformation process is crucial for generating prompts for Large Language Model (LLM) calls in a way that optimizes the use of available information and facilitates the extraction of relevant answers.
For example, the prompt you provided:
"The original question is as follows: Based on the abstract of "Llama 2: Open Foundation and Fine-Tuned Chat Models," what are the two primary objectives achieved in this work, and what is the range of parameters for the large language models developed? We have an opportunity to answer some, or all of the question from a knowledge source. Context information for the knowledge source is provided below, as well as previous reasoning steps. Given the context and previous reasoning, return a question that can be answered from the context. This question can be the same as the original question, or this question can represent a subcomponent of the overall question. It should not be irrelevant to the original question. If we cannot extract more information from the context, provide 'None' as the answer. Some examples are given below: ..."
This is likely a result of the query_transform
function decomposing the original query into a new query that can be answered with the given context and previous reasoning steps. The function is designed to iteratively refine queries based on the context and previous reasoning, enabling the generation of LLM prompts that are more likely to yield relevant and accurate answers.
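To make the shape of that transformation concrete, here is a hedged sketch of how such a step-decompose prompt might be assembled. The template string paraphrases the prompt quoted above; it is not copied from LlamaIndex source, and `build_step_prompt` is a hypothetical helper.

```python
# Paraphrased template; the real one in LlamaIndex includes few-shot examples.
STEP_DECOMPOSE_TMPL = (
    "The original question is as follows: {query}\n"
    "Context information: {context}\n"
    "Previous reasoning: {prev}\n"
    "New question: "
)

def build_step_prompt(query, context, prev_steps):
    # Render earlier (sub_question, sub_answer) pairs, or "None" on step one.
    prev = "\n".join(f"- {q}\n- {a}" for q, a in prev_steps) or "None"
    return STEP_DECOMPOSE_TMPL.format(query=query, context=context, prev=prev)

prompt = build_step_prompt(
    "How many Grand Slam titles does the winner of the 2020 Australian Open have?",
    "Provides names of the winners of the 2020 Australian Open",
    [],
)
```

The LLM's completion of that trailing "New question: " becomes the sub-query sent to the index on the next step.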
The "normal" LLM calls you're seeing are likely the result of the complete
function in the base.py
file of the LlamaIndex codebase. This function constructs a payload with the provided prompt, model details, and additional options. It then makes a POST request to the Ollama API's /api/generate
endpoint with this payload. Upon receiving a successful response, it extracts the text from the response and returns a CompletionResponse
object containing the text, the raw response, and any additional keyword arguments excluding "response".
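For illustration, the payload construction can be sketched as follows. The field names mirror Ollama's `/api/generate` request format, but `build_generate_payload` is a hypothetical helper, not the actual function in `base.py`.

```python
# Hypothetical sketch of the payload sent to Ollama's /api/generate endpoint.
def build_generate_payload(prompt, model, **options):
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": options,  # e.g. temperature, num_ctx
    }

payload = build_generate_payload("Hello", "mistral", temperature=0.1)
# The request itself would be roughly:
#   requests.post(f"{base_url}/api/generate", json=payload)
# and the "response" field of the JSON reply becomes CompletionResponse.text.
```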
If you're finding that the "weird" LLM calls are causing issues with your multi-step queries, you might consider reviewing and possibly refining the logic in the messages_to_prompt
function to ensure it accurately constructs the context for multi-step queries. You could also experiment with different context_window
sizes to find an optimal balance that allows the model to maintain context without being overwhelmed by irrelevant information. Lastly, you could evaluate the training data and capabilities of the "mistralai/Mistral-7B-Instruct-v0.1" model to ensure it aligns with the types of queries and the domain of the dataset you are working with.
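If you want to experiment with the context window, a minimal configuration sketch might look like the following (assuming the `llama_index` HuggingFaceLLM wrapper; exact parameter defaults vary by version).

```python
# Hedged config sketch: tune context_window on the HuggingFace LLM wrapper.
from llama_index.llms.huggingface import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    context_window=3900,   # try smaller values if prompts are being truncated
    max_new_tokens=256,
)
```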
I hope this helps clarify the behavior you're seeing. If you have any more questions or need further clarification, please don't hesitate to ask.
I think the text below is not related to my query.
Question: How many Grand Slam titles does the winner of the 2020 Australian Open have?
Knowledge source context: Provides names of the winners of the 2020 Australian Open
Previous reasoning: None
Next question: Who was the winner of the 2020 Australian Open?
Question: Who was the winner of the 2020 Australian Open?
Knowledge source context: Provides names of the winners of the 2020 Australian Open
Previous reasoning: None.
New question: Who was the winner of the 2020 Australian Open?
Question: How many Grand Slam titles does the winner of the 2020 Australian Open have?
Knowledge source context: Provides information about the winners of the 2020 Australian Open
Previous reasoning:
- Who was the winner of the 2020 Australian Open?
- The winner of the 2020 Australian Open was Novak Djokovic.
New question: None
Question: How many Grand Slam titles does the winner of the 2020 Australian Open have?
Knowledge source context: Provides information about the winners of the 2020 Australian Open - includes biographical information for each winner
Previous reasoning:
- Who was the winner of the 2020 Australian Open?
- The winner of the 2020 Australian Open was Novak Djokovic.
New question: How many Grand Slam titles does Novak Djokovic have?
Question: Based on the abstract of "Llama 2: Open Foundation and Fine-Tuned Chat Models," what are the two primary objectives achieved in this work, and what is the range of parameters for the large language models developed?
Knowledge source context: None
Previous reasoning: None
New question: [/INST] </s>
It seems like your LLM just barfed while generating sub-queries (this "odd" query is a refine step, but the input to the refine step is part of the prompt for generating sub-queries?).
It's hard to know exactly where in your pipeline this happened without stepping through the code in a debugger.
thank you so much.
Question Validation
Question
I use the Llama 2 paper dataset. My code snippet is below.
My original query is:
Based on the abstract of "Llama 2: Open Foundation and Fine-Tuned Chat Models," what are the two primary objectives achieved in this work, and what is the range of parameters for the large language models developed?
I added a print statement in llama_index/llms/huggingface/base.py, and found there are 7 LLM calls in total: 3 weird LLM calls and 4 normal LLM calls. The log of the weird LLM call:
For the weird LLM calls, the prompt seems unrelated to the original query.
For the normal LLM calls, the log looks normal and the prompt is related to the original query; they retrieve data relevant to the original query.