BeachWang / DAIL-SQL

An efficient and effective few-shot NL2SQL method on GPT-4.
Apache License 2.0

Questions about execution accuracy in spider-dev dataset and DAIL-SQL #39

Open BeiwenZhang opened 3 months ago

BeiwenZhang commented 3 months ago

I am interested in your research and admire your state-of-the-art results, but I have two questions:

First, I ran the evaluation with "--selector_type EUCDISMASKPRESKLSIMTHR" and gpt-3.5-turbo; the command and its output are below. I don't understand why the overall execution accuracy is so low (72.3%). Could you please help me with this problem?

PS C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master> python evaluation.py --gold dev_gold.txt --pred RESULTS_MODEL-gpt-3.5-turbo.txt --db C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master\database --etype exec
OK
                     easy      medium    hard      extra     all
count                248       446       174       166       1034
=====================   EXECUTION ACCURACY   =====================
execution            0.883     0.771     0.661     0.422     0.723
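As a sanity check on the output above, the "all" figure is the count-weighted mean of the four per-difficulty accuracies. A minimal sketch of that arithmetic follows; the function and variable names are illustrative and are not taken from evaluation.py.

```python
# Per-difficulty example counts and execution accuracies, copied from the output above.
counts = {"easy": 248, "medium": 446, "hard": 174, "extra": 166}
accuracy = {"easy": 0.883, "medium": 0.771, "hard": 0.661, "extra": 0.422}


def weighted_accuracy(counts, accuracy):
    """Return the count-weighted mean of the per-bucket accuracies."""
    total = sum(counts.values())
    return sum(counts[name] * accuracy[name] for name in counts) / total


total_examples = sum(counts.values())
overall = weighted_accuracy(counts, accuracy)
print(f"{total_examples} examples, overall execution accuracy = {overall:.3f}")
```

With these inputs the result rounds to 0.723 over 1034 examples, matching the reported total. The extra-hard bucket (0.422 on 166 examples) pulls the average down the most.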

Second, in the DAIL-SQL paper you use cosine similarity, but the corresponding selector in the code (EUCDISMASKPRESKLSIMTHR) uses Euclidean distance. Am I choosing the wrong selector?

BeachWang commented 1 month ago

Hi,

Thank you for your interest in our work. When you tested gpt-3.5-turbo using "--selector_type EUCDISMASKPRESKLSIMTHR", did you set "--pre_test_result" to results/graphix_result.txt? In our paper, the preliminary model used for the experiments is Graphix. We also mention in the paper that cosine similarity and Euclidean distance are both available options; for the experiments, we chose Euclidean distance.
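For context on the two metrics: when embeddings are L2-normalized, squared Euclidean distance equals 2 - 2 * cosine similarity, so both produce the same nearest-neighbor ranking. The sketch below illustrates this in plain Python. The vectors and names are made up for illustration, and this is not the repository's selector code. Whether the embeddings used by the selector are normalized is an assumption here, and without normalization the two rankings can differ.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical question embeddings (illustrative values only).
query = [0.2, 0.9, 0.4]
candidates = {
    "ex_a": [0.1, 0.8, 0.5],
    "ex_b": [0.9, 0.1, 0.2],
    "ex_c": [0.3, 0.7, 0.1],
}

q = normalize(query)
unit = {name: normalize(vec) for name, vec in candidates.items()}

# Rank candidates: highest cosine similarity first, smallest distance first.
by_cosine = sorted(unit, key=lambda n: cosine_similarity(q, unit[n]), reverse=True)
by_euclid = sorted(unit, key=lambda n: euclidean_distance(q, unit[n]))

print("cosine ranking:   ", by_cosine)
print("euclidean ranking:", by_euclid)
```

Both rankings come out identical for these unit vectors, which is why the choice between the two metrics matters less than it may first appear when embeddings are normalized.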

Sherlocktein commented 1 month ago

How do you verify the answer?