Closed: Kirushikesh closed this issue 1 year ago.
🤖
Hello @Kirushikesh,
I'm Dosu, a friendly bot here to assist you with LlamaIndex. I can help you troubleshoot issues, answer your questions, and guide you on how to become a contributor. While we wait for a human maintainer, consider me your go-to resource.
I'm currently reviewing your issue and will get back to you with a comprehensive response shortly.
🤖
Based on the error message you provided, it seems like the output from the `CustomLLM` class's `_call` method is not being correctly parsed into JSON. The `SelectionOutputParser` class, which is used to parse the output, is raising a `ValueError` because it's unable to convert the output to JSON.
The `SelectionOutputParser` class expects the output to be in a specific format that can be parsed into a JSON object. If the output is not in the expected format, the `parse` method of the `SelectionOutputParser` class will raise a `ValueError`.
In your case, the output from the `CustomLLM` class's `_call` method seems to be a string, which is not in the expected format. The `SelectionOutputParser` class is trying to parse this string into a JSON object, but it's failing because the string is not valid JSON.
To resolve this issue, you should ensure that the output from the `CustomLLM` class's `_call` method is in the expected format. The expected format is a JSON serializable object. If the output is a string, you should convert it into a JSON serializable object before returning it from the `_call` method.
Here's an example of how you can modify the `_call` method to return a JSON serializable object:
```python
def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
    prompt_length = len(prompt)
    response = pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
    # only return newly generated tokens
    response = response[prompt_length:]
    # convert the response string into a JSON serializable object
    response_json = {"response": response}
    return response_json
```
In this example, the `response` string is wrapped in a dictionary with a single key-value pair: the key is `"response"` and the value is the `response` string. This dictionary is a JSON serializable object, so it can be correctly parsed by the `SelectionOutputParser` class.
Please try this modification and let me know if it resolves your issue. If not, please provide more information about the output from the `CustomLLM` class's `_call` method, and I'll be happy to help further.
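As a rough, simplified sketch (not taken from the LlamaIndex source; the sample strings below are made up for illustration), this is essentially the check that fails: the selector's output parser needs to find valid JSON in the raw LLM string before it can do anything else.

```python
import json

# A typical free-form reply from a small open-source model (made-up example).
raw_output = "Sure! I think the first tool is the best match."

try:
    json.loads(raw_output)
except json.JSONDecodeError:
    # This mirrors the "Failed to convert output to JSON" ValueError that
    # SelectionOutputParser raises further down the stack.
    print(f"Failed to convert output to JSON: {raw_output!r}")

# By contrast, a reply that is itself valid JSON loads without error.
json.loads('{"response": "some text"}')
```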
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
@dosu-bot, thanks for the response. I changed the CustomLLM class implementation and am now getting a new error: "ValidationError: 1 validation error for Generation text str type expected (type=type_error.str)"
The changed code is:
```python
class CustomLLM(LLM):
    # def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
    #     prompt_length = len(prompt)
    #     response = pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
    #     # only return newly generated tokens
    #     return response[prompt_length:]

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        # only return newly generated tokens
        response = response[prompt_length:]
        # convert the response string into a JSON serializable object
        response_json = {"response": response}
        return response_json

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
```
The output when running `query_engine.query("Tell me about the Ian Crocker and his 100 meters records")` is as follows:
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[18], line 1
----> 1 query_engine.query(
2 "Tell me about the Ian Crocker and his 100 meters records"
3 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/indices/query/base.py:23, in BaseQueryEngine.query(self, str_or_query_bundle)
21 if isinstance(str_or_query_bundle, str):
22 str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 23 return self._query(str_or_query_bundle)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/query_engine/router_query_engine.py:146, in RouterQueryEngine._query(self, query_bundle)
142 def _query(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
143 with self.callback_manager.event(
144 CBEventType.QUERY, payload={EventPayload.QUERY_STR: query_bundle.query_str}
145 ) as query_event:
--> 146 result = self._selector.select(self._metadatas, query_bundle)
148 if len(result.inds) > 1:
149 responses = []
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/selectors/types.py:76, in BaseSelector.select(self, choices, query)
74 metadatas = [_wrap_choice(choice) for choice in choices]
75 query_bundle = _wrap_query(query)
---> 76 return self._select(choices=metadatas, query=query_bundle)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/selectors/llm_selectors.py:91, in LLMSingleSelector._select(self, choices, query)
88 choices_text = _build_choices_text(choices)
90 # predict
---> 91 prediction = self._llm_predictor.predict(
92 prompt=self._prompt,
93 num_choices=len(choices),
94 context_list=choices_text,
95 query_str=query.query_str,
96 )
98 # parse output
99 assert self._prompt.output_parser is not None
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/llm_predictor/base.py:181, in LLMPredictor.predict(self, prompt, output_cls, **prompt_args)
179 formatted_prompt = prompt.format(llm=self._llm, **prompt_args)
180 formatted_prompt = self._extend_prompt(formatted_prompt)
--> 181 response = self._llm.complete(formatted_prompt)
182 output = response.text
184 logger.debug(output)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/llms/base.py:312, in llm_completion_callback.<locals>.wrap.<locals>.wrapped_llm_predict(_self, *args, **kwargs)
302 with wrapper_logic(_self) as callback_manager:
303 event_id = callback_manager.on_event_start(
304 CBEventType.LLM,
305 payload={
(...)
309 },
310 )
--> 312 f_return_val = f(_self, *args, **kwargs)
313 if isinstance(f_return_val, Generator):
314 # intercept the generator and add a callback to the end
315 def wrapped_gen() -> CompletionResponseGen:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/llms/langchain.py:61, in LangChainLLM.complete(self, prompt, **kwargs)
59 @llm_completion_callback()
60 def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
---> 61 output_str = self._llm.predict(prompt, **kwargs)
62 return CompletionResponse(text=output_str)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:907, in BaseLLM.predict(self, text, stop, **kwargs)
905 else:
906 _stop = list(stop)
--> 907 return self(text, stop=_stop, **kwargs)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:867, in BaseLLM.__call__(self, prompt, stop, callbacks, tags, metadata, **kwargs)
860 if not isinstance(prompt, str):
861 raise ValueError(
862 "Argument `prompt` is expected to be a string. Instead found "
863 f"{type(prompt)}. If you want to run the LLM on multiple prompts, use "
864 "`generate` instead."
865 )
866 return (
--> 867 self.generate(
868 [prompt],
869 stop=stop,
870 callbacks=callbacks,
871 tags=tags,
872 metadata=metadata,
873 **kwargs,
874 )
875 .generations[0][0]
876 .text
877 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:647, in BaseLLM.generate(self, prompts, stop, callbacks, tags, metadata, run_name, **kwargs)
632 raise ValueError(
633 "Asked to cache, but no cache found at `langchain.cache`."
634 )
635 run_managers = [
636 callback_manager.on_llm_start(
637 dumpd(self),
(...)
645 )
646 ]
--> 647 output = self._generate_helper(
648 prompts, stop, run_managers, bool(new_arg_supported), **kwargs
649 )
650 return output
651 if len(missing_prompts) > 0:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:535, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
533 for run_manager in run_managers:
534 run_manager.on_llm_error(e)
--> 535 raise e
536 flattened_outputs = output.flatten()
537 for manager, flattened_output in zip(run_managers, flattened_outputs):
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:522, in BaseLLM._generate_helper(self, prompts, stop, run_managers, new_arg_supported, **kwargs)
512 def _generate_helper(
513 self,
514 prompts: List[str],
(...)
518 **kwargs: Any,
519 ) -> LLMResult:
520 try:
521 output = (
--> 522 self._generate(
523 prompts,
524 stop=stop,
525 # TODO: support multiple run managers
526 run_manager=run_managers[0] if run_managers else None,
527 **kwargs,
528 )
529 if new_arg_supported
530 else self._generate(prompts, stop=stop)
531 )
532 except BaseException as e:
533 for run_manager in run_managers:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/llms/base.py:1048, in LLM._generate(self, prompts, stop, run_manager, **kwargs)
1042 for prompt in prompts:
1043 text = (
1044 self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
1045 if new_arg_supported
1046 else self._call(prompt, stop=stop, **kwargs)
1047 )
-> 1048 generations.append([Generation(text=text)])
1049 return LLMResult(generations=generations)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/load/serializable.py:97, in Serializable.__init__(self, **kwargs)
96 def __init__(self, **kwargs: Any) -> None:
---> 97 super().__init__(**kwargs)
98 self._lc_kwargs = kwargs
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/pydantic/main.py:341, in pydantic.main.BaseModel.__init__()
ValidationError: 1 validation error for Generation
text
str type expected (type=type_error.str)
🤖
The error you're encountering is due to the return type of the `_call` method in your `CustomLLM` class. The `_call` method is expected to return a string, but in your implementation it's returning a dictionary (`response_json`). This is causing the `ValidationError` you're seeing, as the LlamaIndex framework is expecting a string-type response from the `_call` method.
To resolve this issue, you should modify the `_call` method to return a string. If you want to keep the response in a JSON format, you can convert the dictionary to a string using `json.dumps()`. Here's how you can modify your `_call` method:
```python
import json

class CustomLLM(LLM):
    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        # only return newly generated tokens
        response = response[prompt_length:]
        # convert the response string into a JSON serializable object
        response_json = {"response": response}
        # convert the JSON object to a string
        response_str = json.dumps(response_json)
        return response_str

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
```
This way, the `_call` method will return a string that represents a JSON object, which should resolve the `ValidationError` you're encountering.
Thanks again, @dosu-bot, for the response. There is some improvement, but after updating the code as you suggested I am now getting a new error: "KeyError: 'choice'".
The error log when running `query_engine.query("Tell me about the Ian Crocker and his 100 meters records")` is as follows:
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[30], line 1
----> 1 query_engine.query(
2 "Tell me about the Ian Crocker and his 100 meters records"
3 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/indices/query/base.py:23, in BaseQueryEngine.query(self, str_or_query_bundle)
21 if isinstance(str_or_query_bundle, str):
22 str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 23 return self._query(str_or_query_bundle)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/query_engine/router_query_engine.py:146, in RouterQueryEngine._query(self, query_bundle)
142 def _query(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
143 with self.callback_manager.event(
144 CBEventType.QUERY, payload={EventPayload.QUERY_STR: query_bundle.query_str}
145 ) as query_event:
--> 146 result = self._selector.select(self._metadatas, query_bundle)
148 if len(result.inds) > 1:
149 responses = []
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/selectors/types.py:76, in BaseSelector.select(self, choices, query)
74 metadatas = [_wrap_choice(choice) for choice in choices]
75 query_bundle = _wrap_query(query)
---> 76 return self._select(choices=metadatas, query=query_bundle)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/selectors/llm_selectors.py:100, in LLMSingleSelector._select(self, choices, query)
98 # parse output
99 assert self._prompt.output_parser is not None
--> 100 parse = self._prompt.output_parser.parse(prediction)
101 return _structured_output_to_selector_result(parse)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/output_parsers/selection.py:98, in SelectionOutputParser.parse(self, output)
95 raise ValueError(f"Failed to convert output to JSON: {output!r}")
97 json_output = self._format_output(json_obj)
---> 98 answers = [Answer.from_dict(json_dict) for json_dict in json_output]
99 return StructuredOutput(raw_output=output, parsed_output=answers)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/llama_index/output_parsers/selection.py:98, in <listcomp>(.0)
95 raise ValueError(f"Failed to convert output to JSON: {output!r}")
97 json_output = self._format_output(json_obj)
---> 98 answers = [Answer.from_dict(json_dict) for json_dict in json_output]
99 return StructuredOutput(raw_output=output, parsed_output=answers)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/dataclasses_json/api.py:70, in DataClassJsonMixin.from_dict(cls, kvs, infer_missing)
65 @classmethod
66 def from_dict(cls: Type[A],
67 kvs: Json,
68 *,
69 infer_missing=False) -> A:
---> 70 return _decode_dataclass(cls, kvs, infer_missing)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/dataclasses_json/core.py:171, in _decode_dataclass(cls, kvs, infer_missing)
168 if not field.init:
169 continue
--> 171 field_value = kvs[field.name]
172 field_type = types[field.name]
173 if field_value is None:
KeyError: 'choice'
🤖
From the error log you've provided, it seems like the error is not actually a `KeyError: 'choice'`, but rather a `ValidationError` from Pydantic, which is used for data validation in LlamaIndex. The error message indicates that a string was expected for the 'text' field, but it wasn't provided.
Looking at your custom LLM code, it seems like the issue might be with the format of the response you're returning from the `_call` method. You're returning a JSON string, but it seems like LlamaIndex might be expecting a Python dictionary instead.
Here's the relevant part of your code:
```python
# convert the JSON object to a string
response_str = json.dumps(response_json)
return response_str
```
You might want to try returning the `response_json` dictionary directly, like this:
```python
return response_json
```
This will return a Python dictionary, which should be compatible with the data validation in LlamaIndex.
If this doesn't solve the issue, it would be helpful if you could provide more information about the data you're loading into the index and the exact query you're running. This would help to further diagnose the issue.
No, I tried that already. I think this is a key error coming from the LLMSelector API when selecting the choice/tool based on the query. This is the changed class, implemented as you suggested:
```python
import json

class CustomLLM(LLM):
    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = pipeline(prompt, max_new_tokens=256)[0]["generated_text"]
        # only return newly generated tokens
        response = response[prompt_length:]
        # convert the response string into a JSON serializable object
        response_json = {"response": response}
        # convert the JSON object to a string
        response_str = json.dumps(response_json)
        return response_str

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
```
And I am still getting the `KeyError: 'choice'` when trying to query through the router engine. Please recheck the answer once again.
@Kirushikesh I kind of skipped what the bot was saying haha, but on the original error: this is very common with open-source LLMs.
Relying on open-source LLMs to produce structured outputs is flaky at best, and in my experience it likely requires fine-tuning.
There's not much we can do if the LLM itself isn't outputting valid JSON or following the expected JSON schema.
I found zephyr-beta to be pretty decent in my experience. Otherwise you need a 30B+ parameter model tbh.
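For reference, here is a rough, hypothetical sketch of the shape of output the selector's parser is looking for. The "choice" key comes from the KeyError in the traceback above; the "reason" field name and the example strings are assumptions made for illustration, not the library's documented schema.

```python
from llama_index.output_parsers.selection import SelectionOutputParser

parser = SelectionOutputParser()

# A reply shaped like this should parse cleanly: a JSON array of objects,
# each with a 1-based "choice" index and a short "reason" (field name assumed).
good_output = '[{"choice": 1, "reason": "The question asks about swimming records."}]'
print(parser.parse(good_output))

# A free-form reply like this is what small open-source models tend to produce
# instead; it leads to "Failed to convert output to JSON", or to KeyError:
# 'choice' if the model emits valid JSON that is missing that key.
bad_output = "I think the first tool is the best match."
```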
@logan-markewich thanks for the clarifications, just some questions instead of using a router engine if i used "query engine directly from vector store index" than i can able to work with the same model without any errors. I thought Router Engine in LlamaIndex is only compatible with OpenAI models, but i was even tried it with Llama 70b-chat model still i was getting the same error. Just for my own clarification, is it the issue with the model not some compatibility issue of "LLMSelector()/RouterEngine()"?
Yea, it's just an issue with the model; it's very hard to get open-source models to produce structured outputs these days.
We need to look into integrations like outlines to make this better, I think.
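Purely as a hypothetical sketch of what such an integration might look like (the model name, field names, prompt, and exact outlines calls here are assumptions, not a tested recipe):

```python
# Hypothetical sketch: constrain an open-source model to emit JSON matching
# the selector's expected schema, using the `outlines` library.
from pydantic import BaseModel
import outlines

class Answer(BaseModel):
    choice: int  # 1-based index of the selected tool (field names assumed)
    reason: str

# Any HF causal LM could go here; zephyr-beta is just the model mentioned above.
model = outlines.models.transformers("HuggingFaceH4/zephyr-7b-beta")
generator = outlines.generate.json(model, Answer)

result = generator(
    "Given the tools (1) SQL database and (2) unstructured text, pick the best "
    "one for: 'Tell me about Ian Crocker and his 100 meters records'."
)
print(result.choice, result.reason)
```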
Thanks again, @logan-markewich, for the response. Really appreciate it.
I also face the same error.
I tried to replicate the code from the course "Building Agentic RAG with LlamaIndex" from deeplearning.ai, using Tiny-LLM from this repo instead of an OpenAI model. After some debugging, I added a new function to `llama_index.core.output_parsers.utils.py`, as ChatGPT suggested 😄
```python
def convert_to_json(response):
    summary_start = response.find("- Answer:") + len("- Answer: ")
    summary = response[summary_start:].strip()
    return json.dumps({"summary": summary})
```
Then, I used that function instead of `_marshal_llm_to_json` in `llama_index.core.output_parsers.selection.py`.
However, I got the same error as you did: `KeyError: 'choice'`.
Is there any way to solve this issue? I really appreciate your reply. Thanks.
Bug Description
I was trying to build a QA engine over a SQL database and unstructured text. Since LlamaIndex provides connectivity to Hugging Face LLMs, I went with that. When performing inference on a given query using RouterQueryEngine, I get ValueError: Failed to convert output to JSON regardless of the query.
Version
0.8.49
Steps to Reproduce
Attaching the code snippet:
Relevant Logs/Tracebacks