taoyds / spider

scripts and baselines for Spider: Yale complex and cross-domain semantic parsing and text-to-SQL challenge
https://yale-lily.github.io/spider
Apache License 2.0

Empty databases #14

Closed nweir127 closed 5 years ago

nweir127 commented 5 years ago

Hello, it seems that a number of the sqlite3 files provided are completely unpopulated. Is this by design? It doesn't seem to be based on the train/dev/test split.

Example empty databases: geo, car_1, wta_1
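For anyone who wants to reproduce this check, here is a minimal sketch using Python's standard `sqlite3` module. It assumes the files are laid out as `database/<db_id>/<db_id>.sqlite`. That layout and the helper names (`table_row_counts`, `is_empty`, `find_empty_databases`) are assumptions for illustration, not part of the Spider scripts.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    conn = sqlite3.connect(str(db_path))
    try:
        # sqlite_master lists all schema objects; skip SQLite's internal tables.
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


def is_empty(db_path):
    """True when no user table holds a row (also true when there are no tables)."""
    return all(count == 0 for count in table_row_counts(db_path).values())


def find_empty_databases(root):
    """List db_ids under root whose SQLite file is unpopulated.

    Assumes the hypothetical layout root/<db_id>/<db_id>.sqlite.
    """
    return sorted(
        path.parent.name
        for path in Path(root).glob("*/*.sqlite")
        if is_empty(path)
    )
```

Calling `find_empty_databases("database")` from the directory where the data was unpacked would list the schema-only databases; adjust the path to match your download.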

nweir127 commented 5 years ago

On a similar note, the geo schema is missing a primary key and foreign key for the lake table (line 43715). There are also errors in the formula_1 schema. Could you verify that each of the schemas provided in tables.json is correct?
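A rough way to see which keys a SQLite file actually declares for a table is to query `PRAGMA table_info` and `PRAGMA foreign_key_list`. The sketch below assumes an open `sqlite3` connection; the helper name `key_summary` is made up for illustration. It only reports what the database itself declares, so the result still has to be compared by hand against the entry in tables.json.

```python
import sqlite3


def key_summary(conn, table):
    """Report the primary-key columns and foreign keys declared for one table."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    # pk is 0 for non-key columns, otherwise the 1-based position in the key.
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    primary_key = [
        col[1] for col in sorted(columns, key=lambda col: col[5]) if col[5] > 0
    ]

    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    foreign_keys = [(fk[3], fk[2], fk[4]) for fk in fks]  # (column, ref_table, ref_column)

    return {"primary_key": primary_key, "foreign_keys": foreign_keys}
```

For example, `key_summary(sqlite3.connect("geo.sqlite"), "lake")` (the file path here is a placeholder) returns empty lists for both entries if the table declares no keys.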

taoyds commented 5 years ago

Hi,

Thanks! We updated the following 7 SQLite files in the database/ directory (these files were created from several source files): car_1, wine_1, student_1, csu_1, inn_1, flight_2, formula_1. Please download the Spider data from the official website again.

The databases below are traditional datasets created by others, and no actual databases for them have been published online, so we only provide their schemas.

geo, yelp, scholar, restaurants, Advising, imdb, academic

taoyds commented 5 years ago

As for the schema issue, all schemas in tables.json were extracted from the databases. If you find any serious problems, please send me your specific findings by email. We will fix them together in the next release.