Questions about execution accuracy in spider-dev dataset and DAIL-SQL

BeiwenZhang commented 5 months ago

I am interested in your research and admire your state-of-the-art results, but I have two questions:

First, I ran gpt-3.5-turbo with "--selector_type EUCDISMASKPRESKLSIMTHR" and got the results below. I don't understand why the execution accuracy is so low (72.3%). Could you help me figure out what went wrong?

PS C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master> python evaluation.py --gold dev_gold.txt --pred RESULTS_MODEL-gpt-3.5-turbo.txt --db C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master\database --etype exec
OK
                     easy     medium   hard     extra    all
count                248      446      174      166      1034
=====================   EXECUTION ACCURACY     =====================
execution            0.883    0.771    0.661    0.422    0.723
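As a sanity check on how the numbers above fit together, the short sketch below recomputes the "all" column as a count-weighted average of the per-difficulty accuracies and estimates how many examples fail at each level. The counts and accuracies are copied from the output above. The variable names are illustrative, and the per-level miss counts are approximations, since the reported accuracies are rounded to three decimals.

```python
# Counts and execution accuracies copied from the evaluation output above.
counts = {"easy": 248, "medium": 446, "hard": 174, "extra": 166}
accuracy = {"easy": 0.883, "medium": 0.771, "hard": 0.661, "extra": 0.422}

total = sum(counts.values())

# The overall score is the count-weighted average of the per-level scores.
overall = sum(counts[level] * accuracy[level] for level in counts) / total

# Approximate number of failed examples per level (accuracies are rounded,
# so these are estimates, not exact figures from the evaluator).
missed = {level: round(counts[level] * (1 - accuracy[level])) for level in counts}

print(f"total examples: {total}")
print(f"overall execution accuracy: {overall:.3f}")
for level, n in missed.items():
    print(f"{level}: about {n} of {counts[level]} examples fail")
```

Read this way, most of the gap comes from the medium and extra-hard questions, which is where a missing or different preliminary prediction would plausibly matter most.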

Second, the DAIL-SQL paper uses cosine similarity, but the selector in the corresponding code (EUCDISMASKPRESKLSIMTHR) uses Euclidean distance. Did I choose the wrong selector?

BeachWang commented 4 months ago

Hi,

Thank you for your interest in our work. When you tested gpt-3.5-turbo using "--selector_type EUCDISMASKPRESKLSIMTHR", did you set "--pre_test_result" to results/graphix_result.txt? In our paper, the preliminary model used for the experiments is Graphix. The paper also notes that either cosine similarity or Euclidean distance can be used; for the experiments, we chose Euclidean distance.
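For readers wondering how much the choice between the two metrics matters, here is a minimal, self-contained sketch. It is not code from the DAIL-SQL repository, and all function names are hypothetical. It illustrates a general property: when embedding vectors are L2-normalized, squared Euclidean distance equals 2 - 2 * (cosine similarity), so both metrics rank candidates identically. Without normalization the rankings can differ. Whether the embeddings used by the selector are normalized is an assumption to check, not something stated in this thread.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def normalize(v):
    """Scale v to unit L2 norm."""
    norm = math.hypot(*v)
    return [x / norm for x in v]


def rank_by_cosine(query, candidates):
    """Candidate indices, most similar first."""
    return sorted(range(len(candidates)),
                  key=lambda i: -cosine_similarity(query, candidates[i]))


def rank_by_euclidean(query, candidates):
    """Candidate indices, closest first."""
    return sorted(range(len(candidates)),
                  key=lambda i: math.dist(query, candidates[i]))


# Case 1: unit-length vectors. Both metrics give the same ordering.
rng = random.Random(0)
query = normalize([rng.gauss(0, 1) for _ in range(8)])
unit_candidates = [normalize([rng.gauss(0, 1) for _ in range(8)])
                   for _ in range(20)]
same_ranking = (rank_by_cosine(query, unit_candidates)
                == rank_by_euclidean(query, unit_candidates))

# Case 2: unnormalized vectors. The top match can differ.
raw_query = [1.0, 0.0]
raw_candidates = [[10.0, 0.0],  # same direction as the query, but far away
                  [1.0, 0.5]]   # slightly different direction, but close
cos_top = rank_by_cosine(raw_query, raw_candidates)[0]
euc_top = rank_by_euclidean(raw_query, raw_candidates)[0]

print("same ranking on unit vectors:", same_ranking)
print("top match by cosine (raw):", cos_top)
print("top match by Euclidean (raw):", euc_top)
```

If the embeddings are normalized, switching between the two metrics should not change which examples are selected. If they are not, the selected examples can differ, so it is safest to use the same metric the authors used in their experiments.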

Sherlocktein commented 3 months ago

How do you verify the answer?

oslijunw commented 2 weeks ago

@BeachWang Why is Graphix used for SQL pre-generation here? Is it because it is faster? Otherwise, wouldn't it be fine to just use an LLM?

BeachWang / DAIL-SQL

Questions about execution accuracy in spider-dev dataset and DAIL-SQL #39