Closed utility-aagrawal closed 1 year ago
🤖
Hi there,
Thanks for reaching out and providing detailed information about your issue. It seems like you're experiencing some instability with the Llama-2 7B chat and NLSQLTableQueryEngine. Here are some suggestions that might help improve the reliability of your text-to-SQL queries:
from llama_index.indices.struct_store import NLSQLTableQueryEngine
query_engine = NLSQLTableQueryEngine(
sql_database=sql_database,
tables=["city_stats"],
)
from llama_index.objects import SQLTableNodeMapping, ObjectIndex, SQLTableSchema
from llama_index.indices.struct_store import SQLTableRetrieverQueryEngine
table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [(SQLTableSchema(table_name="city_stats")), ...] # one SQLTableSchema for each table
obj_index = ObjectIndex.from_objects(
table_schema_objs,
table_node_mapping,
VectorStoreIndex,
)
query_engine = SQLTableRetrieverQueryEngine(
sql_database, obj_index.as_retriever(similarity_top_k=1)
)
response = query_engine.query("Which city has the highest population?")
print(response)
# manually set extra context text
city_stats_text = (
"This table gives information regarding the population and country of a given city.\n"
"The user will query with codewords, where 'foo' corresponds to population and 'bar'"
"corresponds to city."
)
table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [(SQLTableSchema(table_name="city_stats", context_str=city_stats_text))]
Use the correct prompt templates: In the context shared, it's shown that the NLSQLTableQueryEngine and SQLTableRetrieverQueryEngine use specific prompt templates for text-to-SQL conversion and response synthesis. Make sure you're using the correct prompt templates for your use case.
Consider using PGVectorSQLQueryEngine: In the context shared, it's shown that there is a PGVectorSQLQueryEngine, a modified version of the normal text-to-SQL query engine that can infer embedding vectors in the SQL query. This is a beta feature and could potentially improve the reliability of your text-to-SQL queries.
Remember to always test your changes to ensure they are improving the reliability of the system.
Unfortunately, I'm unable to view the text file you attached as I can't see URLs. However, I hope the above suggestions will help you improve the reliability of your text-to-SQL queries. If you have any further questions or need more clarification, feel free to ask.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
I have already tried applicable steps and results are still the same. Can you provide more pointers?
Question Validation
Question
I want to implement a text-to-SQL capability using a local model. I am using Llama-2 7B chat with NLSQLTableQueryEngine but the output is unstable:
Case 1: Produces wrong output for the right query: query_str = ( "Can you tell me number of countries in each region?" ) Output: Sure! Based on the question you provided, the SQL query that answers it is: SELECT regions.region_name, COUNT(*) as num_countries FROM regions JOIN countries ON regions.region_id = countries.region_id The results of the query are: [('Americas', 25)] This means that there are 25 countries in the Americas region.
Case 2: Produces correct output for a query with typo: query_str = ( "Can you tell me number of counbtries in each region?" ) Output: Sure! Based on the SQL query you provided, here is the response for the question "Can you tell me the number of countries in each region?": Americas | 5 Asia | 6 Europe | 8 Middle East and Africa | 6
What is your advice to make this work reliably?
Attaching the notebook as a text file for your reference. text2sql_with_local_model.txt
Let me know if you have any questions.