Open jgpruitt opened 5 months ago
Hope Embedding pg_catalo can improve text to sql be more nice that traditional zeroshot or one-shot mod.
I based on zero shot idea develop a shiny llm app. It has 5 steps from convert catalog to final get data.
For example, I want to know "find the top 5 sold music name and total revenue and quantity as well as average unit price", the ai will generate sql for me basedon schema and question, I extract data using the ai sql.
Below is the detail steps.
convert the pg_catalog into sql-ddl sentence, we can
"select
'CREATE TABLE ' || nspname || '.' || relname || ' (' || chr(10)||
array_to_string(
array_agg(attname ||
' ' ||
atttypid::regtype::text||
CASE attnotnull WHEN true THEN ' NOT NULL' ELSE ' NULL' end ||
coalesce(' -- COMMENT ' || quote_literal(description), '')||
',' || chr(10)
), ' '
) || chr(10)|| ');' as definition
from
pg_attribute
join
pg_class on
pg_class.oid = pg_attribute.attrelid
join
pg_namespace on
pg_namespace.oid = pg_class.relnamespace
left join
pg_description on
pg_description.objoid = pg_class.oid
and pg_description.objsubid = pg_attribute.attnum
where
nspname not in ('pg_catalog', 'information_schema')
and relkind in ('r', 'v')
group by
nspname, relname;"
to get dbms name, as this is only postgres. we can say it is constranst
system prompt for text to sql
/* 1. Given the following database schema: */
{sql_ddl}
/* 2. Answer user question followed :
{user_question}
/* 3. Follow below SQL code standard
- 3.1. To begin with, check if the question can be answered by SQL, return explanation about how to change question.
- 3.1.1 explanation should be comments with SQL single line comments format!!!
- 3.2. If the question could be answer by SQL, use following format:
- 3.2.1 Prefer to use common table expression.
- 3.2.2 Only use the tables and fields defined in the database schema
- 3.2.3 Generate single table query when possible.
- 3.2.4 Only left join tables which have same field name and nor more than 3 tables join
- 3.2.5 If necessary,explaination code geneation with SQL single line comments format
- 3.2.6 the DBMS is {dbms_name}. if DBMS is sqlite, pls only use SQL-92 standard.
- 3.2.7 the SQL code should be ending with a semicolon.
*/
=======================================================
Attache my shiny_llm_map git link and demo line for your reference
Experiment with text-to-sql from within the database. Can we embed the pg_catalog and use it to power text-to-sql?