Atlamtiz opened 1 year ago
I have the same confusion. I classified the samples in dev.json according to the criteria defined in the README, but when I fed my results into evaluation.py, I found that evaluation.py classifies gold.txt into the different hardness levels itself, and its classification was slightly different from mine.
So I went into evaluation.py and used the counting functions there to classify the samples instead.
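For anyone else hitting this: the bucketing in evaluation.py is driven by three component counts (produced by its count_component1 / count_component2 / count_others functions on the parsed gold SQL). Below is a minimal sketch of how those counts map to hardness labels; the thresholds paraphrase my reading of Evaluator.eval_hardness, so treat the script itself as the authoritative version, and note that the counts are passed in directly here just to make the logic visible.

```python
# Sketch of Spider's hardness bucketing (based on Evaluator.eval_hardness
# in evaluation.py). In the real script the three counts come from
# count_component1 / count_component2 / count_others applied to the
# parsed gold SQL; here they are plain ints for illustration.
def eval_hardness(comp1: int, comp2: int, others: int) -> str:
    if comp1 <= 1 and comp2 == 0 and others == 0:
        return "easy"
    if (others <= 2 and comp1 <= 1 and comp2 == 0) or \
       (comp1 <= 2 and others < 2 and comp2 == 0):
        return "medium"
    if (others > 2 and comp1 <= 2 and comp2 == 0) or \
       (2 < comp1 <= 3 and others <= 2 and comp2 == 0) or \
       (comp1 <= 1 and others == 0 and comp2 <= 1):
        return "hard"
    return "extra"
```

This is why hand-classifying from the README can disagree slightly with evaluation.py: the script decides hardness from these parsed-SQL component counts, not from the README prose.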
+1
👀
"I didn't find any difficulty metric in Spider, but in the latest paper, Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing, I saw that they distinguished different difficulty levels. However, there seems to be no difficulty measurement in the dataset. Why is that?"