Closed by Vishaal-Kareti 1 year ago
query_toks_no_value is an original field in the Spider dataset, but we do not use it in our project.
Here is the first example in Spider's training set:
{
    "db_id": "department_management",
    "query": "SELECT count(*) FROM head WHERE age > 56",
    "query_toks": [
        "SELECT",
        "count",
        "(",
        "*",
        ")",
        "FROM",
        "head",
        "WHERE",
        "age",
        ">",
        "56"
    ],
    "query_toks_no_value": [
        "select",
        "count",
        "(",
        "*",
        ")",
        "from",
        "head",
        "where",
        "age",
        ">",
        "value"
    ],
    "question": "How many heads of the departments are older than 56 ?",
    "question_toks": [
        "How",
        "many",
        "heads",
        "of",
        "the",
        "departments",
        "are",
        "older",
        "than",
        "56",
        "?"
    ],
    "sql": {
        "from": {
            "table_units": [
                [
                    "table_unit",
                    1
                ]
            ],
            "conds": []
        },
        "select": [
            false,
            [
                [
                    3,
                    [
                        0,
                        [
                            0,
                            0,
                            false
                        ],
                        null
                    ]
                ]
            ]
        ],
        "where": [
            [
                false,
                3,
                [
                    0,
                    [
                        0,
                        10,
                        false
                    ],
                    null
                ],
                56.0,
                null
            ]
        ],
        "groupBy": [],
        "having": [],
        "orderBy": [],
        "limit": null,
        "intersect": null,
        "union": null,
        "except": null
    }
}
Our code only uses the db_id, query, and question fields, so you don't need to generate query_toks_no_value.
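Since only those three fields are read, entries from another dataset could in principle carry just db_id, query, and question. Here is a minimal sketch of that idea. The helper names (to_minimal_entry, merge_examples), the my_db database, and the sample record are hypothetical and are not part of RESDSQL or Spider. The sketch leaves Spider's own entries untouched and only trims the new ones.

```python
import json

# The three fields the reply above says the code actually reads.
REQUIRED_FIELDS = ("db_id", "query", "question")


def to_minimal_entry(record):
    """Return a copy of `record` holding only the required fields."""
    missing = [f for f in REQUIRED_FIELDS if f not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return {f: record[f] for f in REQUIRED_FIELDS}


def merge_examples(spider_examples, extra_examples):
    """Append trimmed extra examples to Spider's list, leaving Spider's entries as-is."""
    return spider_examples + [to_minimal_entry(r) for r in extra_examples]


# One Spider-style entry (shortened) and one hypothetical custom entry.
spider = [{
    "db_id": "department_management",
    "query": "SELECT count(*) FROM head WHERE age > 56",
    "question": "How many heads of the departments are older than 56 ?",
    "query_toks_no_value": ["select", "count", "(", "*", ")", "from",
                            "head", "where", "age", ">", "value"],
}]
custom = [{
    "db_id": "my_db",  # hypothetical database name
    "query": "SELECT name FROM city",
    "question": "List all city names.",
    "source": "extra metadata that gets dropped",
}]

merged = merge_examples(spider, custom)
print(json.dumps(merged[1], indent=4))
```

In practice the two lists would come from json.load on each dataset's train.json, and the merged list would be written back with json.dump. The tables.json files and the database directories need to be combined separately.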
Oh, got it. Thank you for the clarification!
I'm trying to train RESDSQL on another dataset by appending that dataset's train.json, dev.json, and tables.json to Spider's (and adding its database to Spider's as well). I'm a bit stumped on how to generate query_toks_no_value from a query. Is there a script for this, or do you have any advice on how to write one?
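For anyone who still wants to fill in query_toks_no_value for completeness, a rough approximation is to tokenize the query, lowercase every token, and replace literals with the placeholder "value". The sketch below is an assumption-laden approximation, not Spider's official preprocessing: the regex tokenizer and the function name are hypothetical, both quoted strings and numbers are treated as values, and dotted names, comments, and other edge cases may be split differently than in Spider's files.

```python
import re

# A simple SQL tokenizer (an approximation, not Spider's own tokenizer).
TOKEN_RE = re.compile(
    r"""
      '[^']*'                    # single-quoted string literal
    | "[^"]*"                    # double-quoted string literal
    | \d+(?:\.\d+)?              # integer or decimal literal
    | [A-Za-z_][A-Za-z_0-9]*     # keyword or identifier
    | <> | != | >= | <=          # two-character operators
    | \S                         # any other single non-space character
    """,
    re.VERBOSE,
)


def query_toks_no_value(query):
    """Lowercase the query's tokens and replace literals with "value"."""
    toks = []
    for tok in TOKEN_RE.findall(query):
        if tok[0] in "'\"" or tok[0].isdigit():
            toks.append("value")      # string or numeric literal
        else:
            toks.append(tok.lower())  # keyword, identifier, or symbol
    return toks


print(query_toks_no_value("SELECT count(*) FROM head WHERE age > 56"))
# ['select', 'count', '(', '*', ')', 'from', 'head', 'where', 'age', '>', 'value']
```

On the first training example this reproduces the query_toks_no_value list shown in the thread. Other queries should be spot-checked against Spider's own entries before relying on it.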