run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Gemma Issue with "query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool)" #11523

Closed afshinebtia closed 1 month ago

afshinebtia commented 6 months ago

TextSQLQuery.txt

Question Validation

Question

Hi everyone,

I need assistance with my implementation issue and would appreciate it if someone could help me.

I have tried to implement the tutorial on retrieving data from SQL and Wikipedia (). The code works without any issues with GPT models.

When using LLM = Gemma and embedding model = `local:BAAI/bge-small-en-v1.5` in my Jupyter notebook, the model generates appropriate responses for different questions from the individual SQL/Wikipedia data, whether I use the SQL query engine or the vector index query engine on its own.

However, when I use `query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool)` to retrieve data from both data sources and run `response = query_engine.query("Tell me about the arts and culture of the city with the highest population")`, I get the following error:


```
JSONDecodeError                           Traceback (most recent call last)
File c:\Users\.conda\envs\llamaindex_py3.10\lib\site-packages\llama_index\core\output_parsers\selection.py:75, in SelectionOutputParser.parse(self, output)
     74 try:
---> 75     json_obj = json.loads(json_string)
     76 except json.JSONDecodeError as e_json:

File c:\Users\.conda\envs\llamaindex_py3.10\lib\json\__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File c:\Users\.conda\envs\llamaindex_py3.10\lib\json\decoder.py:340, in JSONDecoder.decode(self, s, _w)
    339 if end != len(s):
--> 340     raise JSONDecodeError("Extra data", s, end)
    341 return obj

JSONDecodeError: Extra data: line 7 column 1 (char 210)

During handling of the above exception, another exception occurred:

ScannerError                              Traceback (most recent call last)
File c:\Users\.conda\envs\llamaindex_py3.10\lib\site-packages\llama_index\core\output_parsers\selection.py:84, in SelectionOutputParser.parse(self, output)
...
[
    {
        "choice": 2,
        "reason": "The question is about the arts and culture of a city, so the most relevant choice is (2) Useful for answering semantic questions about different cities."
    }
]
```

dosubot[bot] commented 6 months ago

🤖

Based on the error messages you provided, it seems like there might be an issue with the format of the data being processed. The JSONDecodeError and ScannerError typically occur when the data being parsed is not properly formatted for the expected data type or when there's an issue with the data encoding.

To resolve the JSONDecodeError, ensure that any JSON data being passed to or from the SQL and Wikipedia data retrieval processes is correctly formatted. You can use tools like JSONLint to validate your JSON data. Also, consider adding try-except blocks around your JSON parsing code to handle any exceptions and help identify the exact point of failure.

The ScannerError comes from the YAML parser. The traceback suggests that after `json.loads` fails, `selection.py` falls back to parsing the LLM output as YAML, and that fallback failed as well. If YAML is involved in the configuration or data handling process, verify that any YAML files or strings are correctly formatted; use a YAML validator to check the syntax, and ensure that no tabs are used for indentation, as YAML requires spaces.
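For illustration, here is a minimal sketch of the JSON-then-YAML fallback that the traceback suggests `SelectionOutputParser.parse` performs. The `parse_selection` helper name and the sample string are illustrative, not LlamaIndex's actual code:

```python
import json

import yaml  # PyYAML, already a dependency of llama-index


def parse_selection(output: str):
    """Hypothetical helper: try strict JSON first, then fall back to YAML,
    mirroring the json.loads -> yaml parse sequence in the traceback."""
    try:
        return json.loads(output)
    except json.JSONDecodeError as e_json:
        try:
            return yaml.safe_load(output)
        except yaml.YAMLError as e_yaml:
            raise ValueError(
                f"Output is neither valid JSON ({e_json}) "
                f"nor valid YAML ({e_yaml}):\n{output}"
            )


# The trailing prose after the JSON array is exactly what triggers
# "JSONDecodeError: Extra data" in the traceback above.
raw = '[{"choice": 2, "reason": "semantic question"}]\nHope this helps!'
try:
    print(parse_selection(raw))
except ValueError as err:
    print(err)
```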

Since the errors are happening in the selection.py file, it would be helpful to have more specific information about the operations being performed in this file when the errors occur. If the errors are happening due to specific operations not detailed in your message (like parsing responses from an API or reading configuration files), you would need to apply error handling and validation at those points.

Additionally, ensure that all external data sources, such as responses from Wikipedia or SQL query results, are correctly formatted as JSON before further processing. Implementing robust error handling around data parsing and conversion operations will help in quickly identifying and resolving these issues.

As for the SQLAutoVectorQueryEngine, it is designed to query both SQL databases and vector stores. It intelligently decides whether to query the SQL database or the vector store based on the input. If it opts to query the SQL database, it can also augment the retrieved results with information from the vector store. This augmentation is facilitated by the VectorIndexAutoRetriever. You can find more details about this in the source code.
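For context, here is a condensed sketch of how the two tools are typically wired together for this engine, following the pattern from the question above. It assumes llama-index 0.10+ imports and an LLM/embedding model already configured via `Settings`; the table name, sample rows, and tool descriptions are illustrative, not the tutorial's exact code:

```python
from sqlalchemy import create_engine, text

from llama_index.core import Document, SQLDatabase, VectorStoreIndex
from llama_index.core.query_engine import (
    NLSQLTableQueryEngine,
    RetrieverQueryEngine,
    SQLAutoVectorQueryEngine,
)
from llama_index.core.retrievers import VectorIndexAutoRetriever
from llama_index.core.tools import QueryEngineTool
from llama_index.core.vector_stores.types import MetadataInfo, VectorStoreInfo

# -- structured side: a toy city_stats table --------------------------------
engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE city_stats (city_name TEXT, population INTEGER)"))
    conn.execute(text("INSERT INTO city_stats VALUES ('Tokyo', 13960000)"))
sql_database = SQLDatabase(engine, include_tables=["city_stats"])

sql_tool = QueryEngineTool.from_defaults(
    query_engine=NLSQLTableQueryEngine(
        sql_database=sql_database, tables=["city_stats"]
    ),
    description="Translates natural-language questions into SQL over city_stats "
    "(columns: city_name, population)",
)

# -- unstructured side: a toy vector index over city articles ---------------
vector_index = VectorStoreIndex.from_documents(
    [Document(text="Tokyo has a vibrant arts scene.", metadata={"title": "Tokyo"})]
)
vector_store_info = VectorStoreInfo(
    content_info="articles about different cities",
    metadata_info=[MetadataInfo(name="title", type="str", description="city name")],
)
retriever = VectorIndexAutoRetriever(vector_index, vector_store_info=vector_store_info)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=RetrieverQueryEngine.from_args(retriever),
    description="Useful for answering semantic questions about different cities",
)

# The engine routes each query to SQL or the vector store, and can augment
# SQL results with vector-store context when it chooses the SQL path.
query_engine = SQLAutoVectorQueryEngine(sql_tool, vector_tool)
response = query_engine.query(
    "Tell me about the arts and culture of the city with the highest population"
)
```

Note that the routing decision between the two tools is itself an LLM call that must return structured JSON, which is exactly the step failing in the traceback above.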

I hope this helps! If you continue to experience issues, please provide more details about the specific operations being performed when the errors occur, and I'll be glad to assist further.


#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
logan-markewich commented 6 months ago

This is because this specific feature requires the LLM to output structured JSON, and it seems Gemma did not do this (this is common for open-source LLMs).
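To make the failure mode concrete, here is a stdlib-only reproduction (the strings are illustrative, not Gemma's actual output):

```python
import json

# What the selector needs back from the LLM: JSON and nothing else.
well_formed = '[{"choice": 2, "reason": "semantic question about cities"}]'
json.loads(well_formed)  # parses fine

# What a chatty model often produces: valid JSON followed by extra prose.
# json.loads stops at the end of the array, sees leftover characters, and
# raises the same "Extra data" error as in the traceback above.
chatty = well_formed + "\n\nI chose option 2 because the question is semantic."
try:
    json.loads(chatty)
except json.JSONDecodeError as e:
    print(e)  # -> Extra data: line 3 column 1 (char ...)
```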

afshinebtia commented 6 months ago

> This is because this specific feature requires the LLM to output structured JSON, and it seems Gemma did not do this (this is common for open-source LLMs).

Thanks @logan-markewich for your response. So is there any way to make this work with Gemma?

logan-markewich commented 6 months ago

Prompt engineering? Fine-tuning?
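For the prompt-engineering route, one option is to tighten the selector prompt through the query engine's prompt API. A sketch, assuming the `get_prompts()`/`update_prompts()` API from llama-index 0.10+; the `"selector:prompt"` key is a guess, so substitute whichever key `get_prompts()` actually reports:

```python
from llama_index.core import PromptTemplate

# List the prompts the engine exposes; the exact selector key varies by version.
print(list(query_engine.get_prompts().keys()))

strict_select_prompt = PromptTemplate(
    "Some choices are given below as a numbered list (1 to {num_choices}).\n"
    "---------------------\n"
    "{context_list}\n"
    "---------------------\n"
    "Using only the choices above, return the selection as a JSON array of "
    "objects with 'choice' and 'reason' keys. Output ONLY the JSON, with no "
    "text before or after it.\n"
    "Question: {query_str}\n"
)

# "selector:prompt" is a hypothetical key name -- replace it with the real
# key printed by get_prompts() above.
query_engine.update_prompts({"selector:prompt": strict_select_prompt})
```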

afshinebtia commented 6 months ago

> Prompt engineering? Fine-tuning?

Of course, those could be alternatives; however, I need to use RAG in the context of my work.

By the way, do you have any idea why local models such as Gemma are so slow at generating responses from indexes? GPT-3.5 generates a response in a few seconds, whereas Gemma and Llama take around an hour!

Arsive02 commented 4 months ago

Hi @afshinebtia , you can check out https://github.com/guidance-ai/guidance

afshinebtia commented 4 months ago

> Hi @afshinebtia , you can check out https://github.com/guidance-ai/guidance

Hi @Arsive02, thanks for your reply. I will look into it. This repo seems fantastic!