wrongly parsed samples in released data from spider website

Hi Tao,

In the released dataset, queries with more than two "table_units" are not parsed correctly. For example, at line 7431 of dev.json:

"query": "SELECT count(*) FROM student AS T1 JOIN has_pet AS T2 ON T1.stuid = T2.stuid JOIN pets AS T3 ON T2.petid = T3.petid WHERE T1.sex = 'F' AND T3.pettype = 'dog'"

The 'from' part of this query is parsed as follows, with only two of the three tables and only the first join condition:

"from": {"conds": [ [ false, 2, [ 0, [ 0, 1, false ], null ], [ 0, 9, false ], null ] ], "table_units": [ [ "table_unit", 0 ], [ "table_unit", 1 ] ] }

By contrast, the parse_sql_one.py script gives the correct parse:

'from': {'table_units': [('table_unit', 0), ('table_unit', 1), ('table_unit', 2)], 'conds': [(False, 2, (0, (0, 1, False), None), (0, 9, False), None), 'and', (False, 2, (0, (0, 10, False), None), (0, 11, False), None)]}
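To gauge how many samples are affected, a rough check along the following lines might help. It is only a sketch, not part of the Spider tooling: it assumes each entry in dev.json is an object with a "query" string and a parsed tree under a "sql" key, and the helper names (`expected_table_count`, `find_truncated_from`, `scan_file`) are made up for illustration. It counts JOIN keywords in flat queries and flags entries whose parsed 'from' lists fewer "table_units" than that count implies.

```python
import json
import re

JOIN_RE = re.compile(r"\bjoin\b", re.IGNORECASE)
SELECT_RE = re.compile(r"\bselect\b", re.IGNORECASE)


def expected_table_count(query):
    """Estimate the number of tables in the FROM clause of a flat query.

    Returns None for queries with more than one SELECT (nested or set
    queries), because a plain keyword count cannot tell which FROM clause
    a JOIN belongs to. String literals containing these keywords would
    also confuse this heuristic.
    """
    if len(SELECT_RE.findall(query)) != 1:
        return None
    return len(JOIN_RE.findall(query)) + 1


def find_truncated_from(examples):
    """Return (index, expected, parsed) for entries with too few table_units."""
    suspects = []
    for index, example in enumerate(examples):
        expected = expected_table_count(example["query"])
        if expected is None:
            continue
        # Assumed layout: the parse tree lives under the "sql" key.
        parsed = len(example["sql"]["from"]["table_units"])
        if parsed < expected:
            suspects.append((index, expected, parsed))
    return suspects


def scan_file(path):
    """Load a JSON array of examples from path and scan it."""
    with open(path, encoding="utf-8") as handle:
        return find_truncated_from(json.load(handle))


# The sample quoted above, reduced to the fields the check reads.
released = {
    "query": "SELECT count(*) FROM student AS T1 JOIN has_pet AS T2 "
             "ON T1.stuid = T2.stuid JOIN pets AS T3 ON T2.petid = T3.petid "
             "WHERE T1.sex = 'F' AND T3.pettype = 'dog'",
    "sql": {"from": {"table_units": [["table_unit", 0], ["table_unit", 1]]}},
}
print(find_truncated_from([released]))  # [(0, 3, 2)]
```

Calling `scan_file("dev.json")` would list the suspect entries by their position in the array, which is not the same as the line number in the file. Because nested queries are skipped, the result is a lower bound on the number of affected samples.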

I wonder whether the baseline experiments were run on the released dataset, since they would be affected by these incorrect parses.

Thanks, Yibo

taoyds / spider

wrongly parsed samples in released data from spider website #6