taolusi / chisp

scripts and baselines for CSpider: Chinese semantic parsing and text-to-SQL challenge
https://taolusi.github.io/CSpider-explorer/

data description #11

Closed liweigu closed 3 years ago

liweigu commented 3 years ago

Could you provide a detailed description of the format of the CSpider (train/val) JSON data? Two points:

1. The description from Spider (https://github.com/taoyds/spider/blob/master/preprocess/parsed_sql_examples.sql) is incomplete. For example, it has no detailed description of "where", which has to be worked out from other code.
2. The code at https://github.com/taolusi/chisp/blob/master/preprocess_data.py contains OLD_WHERE_OPS and NEW_WHERE_OPS, which represent two different operator indexings. Is the CSpider data consistent with OLD_WHERE_OPS, which is also what Spider uses? And is NEW_WHERE_OPS used only in the baseline model provided by https://github.com/taolusi/chisp/ ?

taolusi commented 3 years ago

@liweigu Sorry for the late response. For the first question, we do not have a detailed description of the dataset; you may find one in the Spider dataset, or it may be better to work it out from the corresponding code. For the second question, the preprocessing script is the same as Spider's.
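
As a rough illustration of what "analyzing from the code" can look like, here is a minimal sketch that decodes a parsed "where" list into readable strings. It rests on assumptions that should be checked against Spider's `process_sql.py`: that each condition has the shape `(not_op, op_id, val_unit, val1, val2)`, that conditions are joined by `'and'`/`'or'` strings, and that `op_id` indexes the operator tuple below (the ordering that `OLD_WHERE_OPS` appears to mirror). The function name `describe_where` and the example data are hypothetical, not taken from the dataset.

```python
# Assumption: operator order as recalled from Spider's process_sql.py;
# verify against the script before relying on these indices.
WHERE_OPS = ('not', 'between', '=', '>', '<', '>=', '<=', '!=',
             'in', 'like', 'is', 'exists')


def describe_where(where):
    """Turn a parsed 'where' list into a list of readable strings."""
    parts = []
    for item in where:
        if isinstance(item, str):
            # Connector between two conditions: 'and' or 'or'.
            parts.append(item)
            continue
        # Assumed layout: cond_unit = (not_op, op_id, val_unit, val1, val2)
        not_op, op_id, val_unit, val1, val2 = item
        # Assumed layout: val_unit = (unit_op, col_unit1, col_unit2)
        _unit_op, col_unit1, _col_unit2 = val_unit
        # Assumed layout: col_unit = (agg_id, col_id, is_distinct)
        _agg_id, col_id, _is_distinct = col_unit1
        op = WHERE_OPS[op_id]
        if not_op:
            op = 'not ' + op
        text = f"col[{col_id}] {op} {val1!r}"
        if val2 is not None:
            # A second value is only present for 'between'.
            text += f" and {val2!r}"
        parts.append(text)
    return parts


# Hypothetical example, hand-written for illustration only.
example_where = [
    [False, 3, [0, [0, 5, False], None], 30.0, None],
    'and',
    [False, 1, [0, [0, 7, False], None], 10.0, 20.0],
]
print(describe_where(example_where))
```

Here `col[5]` stands for the column at index 5 of the database's column list in the schema file; mapping it to a real column name is left out of the sketch.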

liweigu commented 3 years ago

I got it. Thanks.