h2oai / sql-sidekick

Experiment on QnA tabular data using LLMs and SQL
Apache License 2.0
genai genai-usecase llm llm-chain llm-framework llm-tools llm2sql sql-query txt2sql txt2sql-python-cli wave

sql-sidekick

A simple SQL assistant (WIP) Turn ★ into ⭐ (top-right corner) if you like the project! 🙏

Motivation

Key Features

Installation

Requirements

This project requires a Python version between 3.8.1 and 3.10.0. You can check your Python version by running the following command in your terminal:

python --version

If your Python version is not within the specified range, you may need to update or downgrade it.
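
The same check can be scripted. The snippet below is a minimal sketch, not part of sql-sidekick: the function name `python_version_supported` is hypothetical, and treating both ends of the range as inclusive is an assumption based on the wording above.

```python
import sys

# Bounds taken from the requirement stated above; treating both ends as
# inclusive is an assumption.
MIN_VERSION = (3, 8, 1)
MAX_VERSION = (3, 10, 0)


def python_version_supported(version=None):
    """Return True if (major, minor, micro) falls within the required range."""
    if version is None:
        version = sys.version_info[:3]
    return MIN_VERSION <= tuple(version) <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_supported():
        print(f"Python {found} is within the supported range")
    else:
        print(f"Python {found} is outside the supported range")
```

Tuples compare element by element, so no string parsing of the version is needed.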

Dev

1. git clone git@github.com:h2oai/sql-sidekick.git
2. cd sql-sidekick
3. make setup
4. source ./.sidekickvenv/bin/activate
5. poetry install (in case there is an error, try `poetry update` before `poetry install`)
6. python sidekick/prompter.py

Usage

Dialect: postgres
- docker pull postgres (will pull the latest version)
- docker run --rm --name pgsql-dev -e POSTGRES_PASSWORD=abc -p 5432:5432 postgres
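
The container can take a few seconds to start listening. As a rough sketch (the helper `wait_for_port` is hypothetical and not part of sql-sidekick), the following polls the published port 5432 until a TCP connection succeeds. An open port only shows that something is listening; it does not prove Postgres is ready to serve queries.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=5432, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port(timeout=10.0):
        print("Port 5432 is accepting connections")
    else:
        print("Nothing is listening on port 5432 yet")
```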

Default: sqlite
Steps:
- Download and install .whl --> s3://sql-sidekick/releases/sql_sidekick-0.0.3-py3-none-any.whl
- python3 -m venv .sidekickvenv
- source .sidekickvenv/bin/activate
- python3 -m pip install sql_sidekick-0.0.3-py3-none-any.whl

Start

`sql-sidekick`

Welcome to the SQL Sidekick! I am an AI assistant that helps you with SQL
queries. I can help you with the following:
  0. Generate input schema:
  `sql-sidekick configure generate_schema --data_path "./sample_passenger_statisfaction.csv" --output_path "./table_config.jsonl"`

  1. Configure a local database (for schema validation and syntax checking):
  `sql-sidekick configure db-setup -t "<local_dir_path_to_>/table_info.jsonl"` (e.g., format --> https://github.com/h2oai/sql-sidekick/blob/main/examples/telemetry/table_info.jsonl)

  2. Ask a question: `sql-sidekick query -q "avg Gpus" -s "<local_dir_path_to_>/samples.csv"` (e.g., format --> https://github.com/h2oai/sql-sidekick/blob/main/examples/telemetry/samples.csv)

  3. Learn contextual query/answer pairs: `sql-sidekick learn add-samples` (optional)

  4. Add context as key/value pairs: `sql-sidekick learn update-context` (optional)

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  configure  Helps in configuring local database.
  learn      Helps in learning and building memory.
  query      Asks question and returns SQL
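
To illustrate what checking generated SQL against a local database can look like with the default sqlite dialect, here is a minimal sketch using only the Python standard library. It is not sql-sidekick's implementation: the helpers `load_csv_into_sqlite` and `check_sql`, the table name `telemetry`, and the sample columns are all hypothetical, and every column is stored as TEXT for simplicity.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text, table_name):
    """Create an in-memory SQLite table from CSV text (all columns as TEXT)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', data)
    return conn


def check_sql(conn, sql):
    """Return (True, None) if SQLite can compile the statement, else (False, error)."""
    try:
        # EXPLAIN QUERY PLAN compiles the statement without running it, so
        # syntax errors and unknown tables or columns surface here.
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)


# Hypothetical sample data standing in for a samples.csv file.
SAMPLE_CSV = "host,gpus\nnode-a,2\nnode-b,4\n"
conn = load_csv_into_sqlite(SAMPLE_CSV, "telemetry")

for candidate in (
    "SELECT AVG(gpus) FROM telemetry",   # valid
    "SELECT AVG(cpus) FROM telemetry",   # unknown column
    "SELEC AVG(gpus) FROM telemetry",    # syntax error
):
    ok, error = check_sql(conn, candidate)
    print(candidate, "->", "ok" if ok else f"rejected: {error}")
```

Compiling the statement instead of executing it keeps the check cheap and free of side effects, which suits validating LLM-generated SQL before it is run.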

UI

Steps to start locally

  1. Download Wave server 0.26.3
  2. tar -xzf wave-0.26.3-linux-amd64; ./waved -max-request-size="20M"
  3. Download the latest bundle: https://github.com/h2oai/sql-sidekick/releases/latest
  4. unzip ai.h2o.wave.sql-sidekick.x.x.x.wave
  5. make setup
  6. source ./.sidekickvenv/bin/activate
  7. make run

Citation & Acknowledgment

Please consider citing our project if you find it useful:

Blogs:

https://medium.com/the-story-within/state-of-text-to-sql-dc3e3e4f8c64

@software{sql-sidekick,
    title = {{sql-sidekick: A simple SQL assistant}},
    author = {Pramit Choudhary and Michal Malohlava and Narasimha Durgam and Robin Liu and {h2o.ai Team}},
    url = {https://github.com/h2oai/sql-sidekick},
    year = {2024}
}

LLM frameworks adopted: h2ogpt, h2ogpte, LangChain, llama_index, openai