taoyds / spider

scripts and baselines for Spider: Yale complex and cross-domain semantic parsing and text-to-SQL challenge
https://yale-lily.github.io/spider
Apache License 2.0

Equivalence of SQL queries #1

Closed: namednil closed this issue 6 years ago

namednil commented 6 years ago

Hi,

this corpus looks great and I hope it will encourage people to do interesting research! I'm wondering if you considered an evaluation based on proving equivalence of SQL queries, which can be done automatically by Cosette?

Best, Matthias

taoyds commented 6 years ago

Hi, Matthias,

We wrote a script that parses each SQL query into smaller clauses/components and then evaluates each clause separately, comparing its items as a set so that order doesn't matter. Two SQL queries are considered equivalent if all of their clauses match. You can find more details on this page: https://github.com/taoyds/spider/tree/master/evaluation_examples .
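
To make the idea concrete, here is a minimal sketch of clause-wise set matching. It is not the actual Spider evaluation script: the names `split_clauses` and `same_components` are hypothetical, the keyword list is a simplification, and it assumes flat queries with no subqueries, no `BETWEEN ... AND ...`, and no case-sensitive string literals.

```python
import re

# Simplified, hypothetical list of clause keywords (not Spider's real grammar).
_CLAUSE_RE = re.compile(r"\b(select|from|where|group by|having|order by|limit)\b")


def split_clauses(sql):
    """Map each clause keyword to the set of items that follow it."""
    # Lowercase and collapse whitespace so formatting differences are ignored.
    # Note: this also lowercases string literals, which is a simplification.
    normalized = " ".join(sql.lower().strip().rstrip(";").split())
    # A capturing group makes re.split keep the keywords:
    # ["", "select", " a, b ", "from", " t", ...]
    parts = _CLAUSE_RE.split(normalized)
    clauses = {}
    for keyword, body in zip(parts[1::2], parts[2::2]):
        # Conditions are separated by AND, everything else by commas.
        separator = r"\band\b" if keyword in ("where", "having") else ","
        clauses[keyword] = {
            item.strip() for item in re.split(separator, body) if item.strip()
        }
    return clauses


def same_components(pred_sql, gold_sql):
    """True if both queries have the same clauses with the same item sets."""
    return split_clauses(pred_sql) == split_clauses(gold_sql)


if __name__ == "__main__":
    gold = "SELECT name, age FROM singer WHERE age > 20 AND country = 'US'"
    pred = "select age, name from singer where country = 'US' and age > 20"
    print(same_components(pred, gold))
```

Because each clause is compared as a set, reordering the selected columns or the `WHERE` conditions does not affect the result, while any missing or extra item in a clause does.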

The tool you mentioned looks pretty cool. Unfortunately, we didn't know about it when we were working on this project. We would definitely like to check it out for a future release. Also, it would be great if anyone could contribute. That said, our evaluation script works well on the current Spider dataset.

Thanks!

====followup updates======

Hi, Matthias,

Have you used Cosette before? What evaluation accuracy can this tool achieve? If it is reliable, it might be possible to replace execution accuracy with its results.

Best, Tao