Chia-Hsuan-Lee / KaggleDBQA

Introduction page of a challenging text-to-SQL dataset: KaggleDBQA

ambiguous questions #3

Open YoungJaeChoung opened 3 months ago

YoungJaeChoung commented 3 months ago

I think some questions have alternative queries.


1.


    • file name: GeoNuclearData.json

3.


4.


5.

Chia-Hsuan-Lee commented 3 months ago

Hello! Indeed, in text-to-SQL benchmarks it is not uncommon for a question to have multiple valid SQL queries, and during annotation, humans typically cannot list every possible one. To work around this, I would suggest looking at evaluation methods other than Exact Match, for example execution accuracy as used by BIRD-SQL.

Thanks for pointing this out anyways!
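
To illustrate the suggestion above, here is a minimal sketch of an execution-based check: two queries count as equivalent when they return the same rows on the same database, even if their SQL text differs. This is not the BIRD-SQL evaluation script; the `execution_match` helper, the `plants` table, and the sample rows are all hypothetical and exist only for this example. The comparison ignores row order, which is an assumption that would need revisiting for questions where ordering matters.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows.

    Hypothetical helper for illustration; row order is ignored.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


# Hypothetical toy database, not taken from KaggleDBQA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plants (name TEXT, country TEXT, capacity INTEGER)")
conn.executemany(
    "INSERT INTO plants VALUES (?, ?, ?)",
    [("A", "Japan", 8000), ("B", "France", 5000), ("C", "Canada", 3000)],
)

# Two differently written queries that answer the same question.
gold = "SELECT country FROM plants ORDER BY capacity DESC LIMIT 1"
alternative = (
    "SELECT country FROM plants "
    "WHERE capacity = (SELECT MAX(capacity) FROM plants)"
)
# A query that answers a different question (smallest capacity).
wrong = "SELECT country FROM plants ORDER BY capacity ASC LIMIT 1"

print(gold == alternative)                        # the SQL strings differ
print(execution_match(conn, alternative, gold))   # same result rows
print(execution_match(conn, wrong, gold))         # different result rows
```

A plain string comparison would reject `alternative` even though it returns the same answer as `gold`, which is the situation described in this issue. Execution-based checks can in turn accept a wrong query that happens to return the same rows on a particular database, so they are a complement to careful annotation rather than a replacement for it.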

YoungJaeChoung commented 3 months ago

Thank you for sharing the paper. I will read it. :)