defog-ai / sql-eval

Evaluate the accuracy of LLM generated outputs
Apache License 2.0

BUG run postgreSQL sqleval not work completely. #165

Open exceedzhang opened 3 weeks ago

exceedzhang commented 3 weeks ago

Running sql-eval against PostgreSQL does not complete successfully.


Traceback (most recent call last):
  File "/Users/exceed/PycharmProjects/sql-eval/main.py", line 81, in <module>
    run_openai_eval(args)
  File "/Users/exceed/PycharmProjects/sql-eval/eval/openai_runner.py", line 79, in run_openai_eval
    result_dict = f.result()
  File "/Users/exceed/miniconda3/envs/sql-eval/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/Users/exceed/miniconda3/envs/sql-eval/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Users/exceed/miniconda3/envs/sql-eval/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/exceed/PycharmProjects/sql-eval/query_generators/openai.py", line 143, in generate_query
    table_metadata_string = prune_metadata_str(
  File "/Users/exceed/PycharmProjects/sql-eval/utils/pruning.py", line 213, in prune_metadata_str
    table_metadata_csv = get_md_emb(
  File "/Users/exceed/PycharmProjects/sql-eval/utils/pruning.py", line 125, in get_md_emb
    column_type, col_desc = column_info.split("),", 1)
ValueError: not enough values to unpack (expected 2, got 1)
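For readers following the traceback, here is a minimal sketch of how that last line can raise this exact error. It is not the repository's code: the sample strings and the `unpack_strict` / `unpack_tolerant` helpers are hypothetical, and the real metadata format in `utils/pruning.py` may differ. The only thing taken from the traceback is the `split("),", 1)` unpacking pattern, which fails whenever the input contains no `"),"` separator.

```python
def unpack_strict(column_info: str):
    # Same unpacking pattern as the line shown in the traceback.
    column_type, col_desc = column_info.split("),", 1)
    return column_type, col_desc


def unpack_tolerant(column_info: str):
    # Hypothetical alternative: str.partition always returns three parts,
    # so a missing separator yields an empty description instead of an error.
    column_type, _sep, col_desc = column_info.partition("),")
    return column_type, col_desc


# Hypothetical inputs, chosen only to illustrate the two cases.
well_formed = "(integer), unique id"
malformed = "(integer)"  # no "),": split() returns a single element

print(unpack_strict(well_formed))

try:
    unpack_strict(malformed)
except ValueError as exc:
    print(f"ValueError: {exc}")

print(unpack_tolerant(malformed))
```

If the metadata string reaching `get_md_emb` lacks a `"),"` for some column, that would be consistent with the error above, but the actual cause here is not established in this thread.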

wongjingping commented 3 weeks ago

@exceedzhang could you share with us the command you ran? We would need something reproducible to help you debug this issue.

exceedzhang commented 3 weeks ago

@wongjingping I used a locally deployed model in place of GPT-3.5-turbo:

export OPENAI_API_KEY="EMPTY"
export OPENAI_BASE_URL="http://172.16.1.220:8000/v1/"

python main.py \
-db postgres \
-q "data/questions_gen_postgres.csv" \
-o results/openai.csv \
-g oa \
-f prompts/prompt_openai.md \
-m gpt-3.5-turbo-0125 \
-n 200 \
-p 5

wongjingping commented 3 weeks ago

Hi @exceedzhang, how are you setting up your local deployment model? Could you share those commands? The commands you shared above seem to be for testing with OpenAI's gpt-3.5 model. In that case, I was able to run the code successfully:

python main.py \
-db postgres \
-q "data/questions_gen_postgres.csv" \
-o results/openai.csv \
-g oa \
-f prompts/prompt_openai.md \
-m gpt-3.5-turbo-0125 \
-n 200 \
-p 25
...
Correct so far: 147/200 (73.50%): 100%|████████████████████████████████████████████████████████████| 200/200 [00:13<00:00, 15.28it/s]
   query_category  num_rows  mean_correct  mean_error_db_exec
0  date_functions        25      0.640000            0.200000
1        group_by        35      0.828571            0.028571
2        instruct        35      0.800000            0.057143
3        order_by        35      0.885714            0.028571
4           ratio        35      0.485714            0.142857
5      table_join        35      0.742857            0.085714
Average correct rate: 0.73
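As a quick sanity check on the table above, the per-category rates can be recombined into the overall figure. This is only a sketch using the numbers printed in the output. It is not how sql-eval computes its summary, and the variable names are illustrative.

```python
# (query_category, num_rows, mean_correct) copied from the output above.
rows = [
    ("date_functions", 25, 0.640000),
    ("group_by", 35, 0.828571),
    ("instruct", 35, 0.800000),
    ("order_by", 35, 0.885714),
    ("ratio", 35, 0.485714),
    ("table_join", 35, 0.742857),
]

total = sum(n for _, n, _ in rows)

# Each rate is a rounded fraction, so round back to a whole number of questions.
correct = sum(round(n * rate) for _, n, rate in rows)

overall = correct / total
print(f"{correct}/{total} = {overall:.4f}")
```

The weighted total comes to 147 of 200 questions, matching the "Correct so far: 147/200 (73.50%)" line in the progress output.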