Open imoneoi opened 10 months ago
Hi @imoneoi, thanks for your interest in our work! Sure, I'd be happy to share the dataset generation script. I used the script at https://github.com/microsoft/Table-Pretraining/tree/main/data_generator to synthesize the dataset. I'm still trying to build a clean repo that synthesizes SQL queries from any table in CSV format, but it may still take some time 😂
Can you share your dataset generation script for symbolic SQL data? I found some invalid SQL and wanted to improve it.
There are spaces in table column names, which makes the SQL invalid when the names are not quoted, as shown in the example below.
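
To make the problem concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is for illustration only and is not a query from the dataset: the table `t`, its columns, and the helper `quote_identifier` are all hypothetical. It shows that an unquoted column name containing a space is rejected, while the same name wrapped in double quotes is accepted.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quotes.

    Hypothetical helper for illustration; not part of the data_generator script.
    """
    return '"' + name.replace('"', '""') + '"'


# Hypothetical in-memory table with a column name that contains a space.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("player name" TEXT, score INTEGER)')
conn.execute("INSERT INTO t VALUES ('alice', 3)")

# Unquoted: the space splits the column name, so SQLite rejects the query.
try:
    conn.execute("SELECT score FROM t WHERE player name = 'alice'")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

# Quoted: the same column name is accepted.
column = quote_identifier("player name")
rows = conn.execute(f"SELECT score FROM t WHERE {column} = 'alice'").fetchall()

print("unquoted query accepted:", unquoted_ok)
print("quoted query rows:", rows)
```

Quoting every column name when a query is generated would be one possible fix; whether it fits the actual generation script is an assumption.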