vanna-ai / vanna

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
https://vanna.ai/docs/
MIT License
11.03k stars 860 forks source link

How to guide the AI shot at correct table? #439

Closed ray-linn closed 4 months ago

ray-linn commented 4 months ago

Describe the bug I created many views as flat tables and hope this could simplify the table relationship, but as a result, there are many duplicated field existing in tables, and sometime AI will generate SQL over wrong table, and lost the result.

To Reproduce Steps to reproduce the behavior:

  1. I defines a lookup table for working shift . ddl as following .
    vn.train(ddl='''
        --- A lookup table to query shift,start time and end time over company ---
        CREATE TABLE [HumanResources].[Shift] (
        [ShiftID]      TINYINT      IDENTITY (1, 1) NOT NULL,
        [Name]         NVARCHAR (50) NOT NULL,
        [StartTime]    TIME (7)     NOT NULL,
        [EndTime]      TIME (7)     NOT NULL,
        [ModifiedDate] DATETIME     NOT NULL);
        ''')
  2. in the View ".vEmployeeDepartmentHistory", there is a duplicated field "shift" (inner join to Shift table), because I do not like expose inner join to Vanna, I list the DDL as following.
       self.vn.train(ddl='''
.....
            [Shift]            NVARCHAR (50) NOT NULL,
.....
        ''')
  1. i aslo add documents to tell LLM, the [shift] here is assigned to employee 'Field Shift of table HumanResources.vEmployeeDepartmentHistory is the shift that assigned to an individual employee。'

  2. A simple question is asked - vn.ask(question='''What is the shift duration for each shift?''')

Expected behavior I suppose it will select something from HumanResources.Shift table, however , incorrect SQL is generated as "

SELECT Name, datediff(hour, StartTime, EndTime, this could be any of the following fields: DepartmentID, DepartmentName, or DepartmentName of the employee."...

Error logs/Screenshots SELECT Name, datediff(hour, StartTime, EndTime, this could be any of the following fields: DepartmentID, DepartmentName, or DepartmentName of the employee."...

Desktop (please complete the following information where): Vanna 0.5.4 Model CodeGemma, WizardCoder,

Additional context Once i removed the "shift" from ".vEmployeeDepartmentHistory" DDL, everything is ok.