awslabs / unified-text2sql-benchmark

UNITE: A Unified Benchmark for Text-to-SQL Evaluation
https://arxiv.org/abs/2305.16265
Apache License 2.0

What format should the predicted query output by LLM use? #8

Open Nutingnon opened 1 week ago

Nutingnon commented 1 week ago

Hi, I noticed that the processed file dev.jsonl has the following format: {"db_id": "1-10015132-11", "question": "What position does the player who played for butler cc (ks) play?", "query": "SELECT position FROM toronto_raptors_all_time_roster_l WHERE school_club_team = \"Butler CC (KS)\""}

This differs from the format required by the test-suite-sql-eval tool that you mention in your README.md. For example, one line in the gold.txt of test-suite-sql-eval is: SELECT * FROM AIRLINES flight_2

As you can see, the formats differ, which means the tool cannot be applied directly to dev.jsonl. Could you please provide more details on how I should organize the queries predicted by the LLM, and how to calculate execution accuracy?

Thank you for your help. I look forward to your response.

lanwuwei commented 1 week ago

Hello,

The processed format is not meant to be used for evaluation directly, but it should be easy to convert.

The file format for evaluation is shown in test-suite-sql-eval (here is the link). For example, each line of the gold file has the format gold SQL \t db_id, and the prediction file has exactly one predicted SQL query per line.

Note: use only the exec (execution accuracy) mode for evaluation.
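
The conversion described above could be sketched roughly as follows. This is only a minimal illustration, not a script from this repository: the helper names jsonl_to_gold_lines and to_pred_lines are hypothetical, and it assumes every record carries the "db_id" and "query" fields shown in the dev.jsonl example.

```python
import json


def one_line(sql):
    # The evaluation files are line-oriented, so newlines and tabs inside a
    # query are replaced with spaces (assumption: this does not change the
    # meaning of the queries being evaluated).
    return sql.replace("\r", " ").replace("\n", " ").replace("\t", " ").strip()


def jsonl_to_gold_lines(jsonl_text):
    """Turn processed JSONL records into 'gold SQL<TAB>db_id' lines."""
    gold_lines = []
    for raw in jsonl_text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        record = json.loads(raw)
        gold_lines.append(one_line(record["query"]) + "\t" + record["db_id"])
    return gold_lines


def to_pred_lines(predicted_sqls):
    """One predicted SQL per line, in the same order as the gold lines."""
    return [one_line(sql) for sql in predicted_sqls]


# Sample record taken from the dev.jsonl example quoted in this thread.
sample = json.dumps({
    "db_id": "1-10015132-11",
    "question": "What position does the player who played for butler cc (ks) play?",
    "query": 'SELECT position FROM toronto_raptors_all_time_roster_l '
             'WHERE school_club_team = "Butler CC (KS)"',
})

gold_lines = jsonl_to_gold_lines(sample)
pred_lines = to_pred_lines(["SELECT position\nFROM toronto_raptors_all_time_roster_l"])
print(gold_lines[0])
print(pred_lines[0])
```

Writing "\n".join(gold_lines) to a gold file and "\n".join(pred_lines) to a prediction file should give the two inputs the evaluator expects. The predictions have to stay in the same order as the gold lines, since the two files are compared line by line.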