gamkers / studentbae_AI


LLM Chat with SQL Database #1

Open PKNaveen opened 2 months ago

PKNaveen commented 2 months ago

As the title suggests, LangChain uses the following prompt with any of the common LLMs:

You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the databas
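For reference, here is a minimal sketch of how the `{dialect}` and `{top_k}` placeholders in the prompt above get filled in. It uses plain `str.format` on an abbreviated copy of the prompt, not LangChain's own templating; `render_prompt` is a hypothetical helper name.

```python
# Abbreviated copy of the prompt quoted above; only the two lines that
# contain placeholders are kept.
PROMPT_TEMPLATE = (
    "Given an input question, create a syntactically correct {dialect} query "
    "to run, then look at the results of the query and return the answer.\n"
    "Unless the user specifies a specific number of examples they wish to "
    "obtain, always limit your query to at most {top_k} results."
)


def render_prompt(dialect: str, top_k: int) -> str:
    """Substitute the SQL dialect and the row limit into the template."""
    return PROMPT_TEMPLATE.format(dialect=dialect, top_k=top_k)


if __name__ == "__main__":
    print(render_prompt("sqlite", 10))
```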

💡 A solution for this would be to limit access to tables that contain only relevant, usable information, with no private data such as names, addresses, or social security numbers. This could serve as a storage DB that keeps students' PDFs in one place. Access to it should be read-only. There is already an existing system that uses this process, called Vanna AI (link below), but we will have to test it for our use case.
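A rough sketch of the read-only, limited-tables idea, assuming SQLite and Python's standard `sqlite3` module (the thread does not say which database we would use). The table names `documents` and `students` and the helper names are made up for illustration. The connection is opened with `mode=ro`, and an authorizer callback denies reads from any table that is not on the allowlist.

```python
import os
import sqlite3
import tempfile

# Hypothetical allowlist: only tables without private data are exposed.
ALLOWED_TABLES = {"documents"}


def authorizer(action, arg1, arg2, db_name, source):
    """Permit SELECTs that read allowlisted tables; deny everything else."""
    if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_FUNCTION):
        return sqlite3.SQLITE_OK
    # For SQLITE_READ, arg1 is the table name and arg2 the column name.
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def open_restricted(path):
    """Open the database read-only and attach the table allowlist."""
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    conn.set_authorizer(authorizer)
    return conn


def build_demo_db():
    """Create a throwaway database with one public and one private table."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT);
        INSERT INTO documents (title) VALUES ('notes.pdf');
        INSERT INTO students (name, ssn) VALUES ('A', '000');
        """
    )
    conn.commit()
    conn.close()
    return path


def is_blocked(conn, sql):
    """Return True if the statement is rejected by the restrictions."""
    try:
        conn.execute(sql).fetchall()
        return False
    except sqlite3.Error:
        return True


if __name__ == "__main__":
    conn = open_restricted(build_demo_db())
    print(conn.execute("SELECT title FROM documents").fetchall())
    print(is_blocked(conn, "SELECT name FROM students"))
    print(is_blocked(conn, "INSERT INTO documents (title) VALUES ('x')"))
```

On a server database such as PostgreSQL or MySQL, the same effect would more likely come from a dedicated user with `SELECT` granted only on the relevant tables.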

Let me know your thoughts on this @gamkers @VISWANATH78

Resources: SQL Agents, Vanna AI, SQL Agent

VISWANATH78 commented 2 months ago

Prompting will work, but how the data is fetched is still a question, @PKNaveen. It will query the database, but some kinds of data will be tough to obtain. For example, say I need the sum of all the elements in column a. It needs to fetch the column and then compute the sum; it will write the SQL query for that and it will work. But if I ask it to give me the unique questions from the provided data and categorize them, the way we would ask ChatGPT, it won't be able to write a query for that. The best way, I think, is to train the model on the necessary sources with an autoencoder and store them as grouped entity clusters, as in the "GraphRAG" methodology. I know the theory but not how to code it. That way we can correlate the meaning of every word and obtain the exact data. If I'm not wrong, do tell me what we can do to optimize this process.
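To make the sum case concrete, here is a small sketch using Python's `sqlite3` with an in-memory database. The table name `t` and the sample values are assumptions; only the column name `a` comes from the example above. The whole request maps onto a single aggregate query, which is why this kind of question works, whereas "categorize the unique questions" has no SQL operator to map onto.

```python
import sqlite3

# Throwaway in-memory table standing in for the real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.executemany("INSERT INTO t (a) VALUES (?)", [(1,), (2,), (3,)])


def column_sum(connection):
    """Run the one aggregate query an SQL agent would need to generate."""
    return connection.execute("SELECT SUM(a) FROM t").fetchone()[0]


if __name__ == "__main__":
    print(column_sum(conn))  # 1 + 2 + 3 = 6
```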

PKNaveen commented 2 months ago

Isn't GraphRAG meant for graph databases? Wouldn't it be infeasible for a relational DB? @VISWANATH78