ibm-granite-community / pm

Granite Community Project Management

Text To SQL recipe #7

Open adampingel opened 4 months ago

adampingel commented 4 months ago

Objectives

As a potential user of Granite Code, I will be able to experiment hands-on with Granite Code's text-to-SQL capability and get some initial, concrete details of how a production deployment that leverages this capability could be built.

This capability was touted in a June 1 blog entry, https://research.ibm.com/blog/granite-LLM-text-to-SQL, which cites the BIRD leaderboard at https://bird-bench.github.io/ ("BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation) represents a pioneering, cross-domain dataset that examines the impact of extensive database contents on text-to-SQL parsing. BIRD contains over 12,751 unique question-SQL pairs, 95 big databases with a total size of 33.4 GB").

The workflow should

The textual context should

Existing Prompting Service

From a slide on IBM Flowpilot SQL Prompting Service for Granite:

The SQL prompting library lets users create custom prompts quickly and intuitively, so that it is easy to experiment with new prompts.

It is consumed in 3 ways:

The library can make use of various granite-based models for schema linking, content linking and SQL generation.

The library can connect to databases to obtain schemas and sample values to be included in the prompt.
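To make the idea of pulling schemas and sample values into a prompt concrete, here is a minimal sketch in Python. It is not the Flowpilot library's API: the function names (`describe_schema`, `build_prompt`, `demo_connection`), the prompt wording, and the toy `flights` table are all hypothetical, and it assumes a SQLite database only because that keeps the example self-contained.

```python
import sqlite3


def describe_schema(conn, sample_rows=3):
    """Return each table's CREATE statement followed by a few sample rows as SQL comments."""
    parts = []
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for name, ddl in tables:
        parts.append(ddl.strip() + ";")
        # Table names come from sqlite_master, so quoting them is enough here.
        cur = conn.execute(f'SELECT * FROM "{name}" LIMIT ?', (sample_rows,))
        columns = [d[0] for d in cur.description]
        parts.append("-- sample rows (" + ", ".join(columns) + "):")
        for row in cur.fetchall():
            parts.append("-- " + repr(row))
    return "\n".join(parts)


def build_prompt(conn, question):
    """Assemble a text-to-SQL prompt (hypothetical wording, not Granite's official template)."""
    return (
        "Given the following SQLite schema and sample values, "
        "write one SQL query that answers the question.\n\n"
        + describe_schema(conn)
        + "\n\nQuestion: " + question
        + "\nSQL:"
    )


def demo_connection():
    """Create a toy in-memory database (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE flights (id INTEGER PRIMARY KEY, airline TEXT, origin TEXT);
        INSERT INTO flights VALUES (1, 'AA', 'JFK'), (2, 'UA', 'SFO');
        """
    )
    return conn


if __name__ == "__main__":
    print(build_prompt(demo_connection(), "Which airline operates flight 2?"))
```

The resulting string would then be sent to whichever Granite model endpoint is in use; that call is left out because the issue does not specify one.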

UI Considerations (Not in Scope)

If we build a demonstration LangChain.JS + LangServe app to demonstrate this, we should consider:

Test Cases

In addition to the BIRD benchmark cited above, some other schemas for testing might include:

Airlines

Stocks

Org
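One way such test schemas could be exercised is to run a model-generated query and a hand-written reference query against the same database and compare the rows returned. The sketch below assumes that approach; it is not the BIRD evaluation script, and the `execution_match` helper and the toy `prices` table are hypothetical.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, reference_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A generated query that fails to run counts as a miss.
        return False
    reference = conn.execute(reference_sql).fetchall()
    # Counter compares the rows as multisets, so duplicates still matter.
    return Counter(predicted) == Counter(reference)


def stocks_connection():
    """Toy in-memory 'stocks' schema (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE prices (ticker TEXT, close REAL);
        INSERT INTO prices VALUES ('IBM', 170.5), ('XYZ', 42.0), ('ABC', 101.0);
        """
    )
    return conn


if __name__ == "__main__":
    conn = stocks_connection()
    reference = "SELECT ticker FROM prices WHERE close > 100 ORDER BY ticker"
    generated = "SELECT ticker FROM prices WHERE close > 100"
    print(execution_match(conn, generated, reference))
```

Comparing result sets rather than SQL text tolerates harmless differences in query phrasing, at the cost of occasionally accepting a wrong query that happens to return the same rows on small data.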

Acceptance Criteria

Assumptions, Open Questions, and Potential Complications

Other References

fayvor commented 3 months ago

Digging into this.

fayvor commented 3 months ago

Some additional databases, under the MIT license:

https://github.com/lerocha/chinook-database

https://github.com/microsoft/sql-server-samples/tree/master