Open liaoxingjian opened 1 year ago
Hi, thank you for your helpful suggestion! We used Dr. Spider as the test set because we wanted to investigate the model's generalization and inference capabilities when the training and test distributions differ. Dr. Spider is built on the Spider dev set and introduces various types of perturbations, which are described in detail in the original paper. We will also include performance on the Spider dev set in future versions.
I would like to know the execution accuracy of ZeroNL2SQL on the Spider dev set, but this does not seem to be reported in the paper or the repo. It would be appreciated if the authors could provide this result.