Spider results

Q: I wonder what the difference is between these two metrics. A: EX (execution accuracy) and TS (test-suite accuracy) are distinct evaluation metrics. TS is the more reliable of the two because it uses "test suites" to greatly reduce the false positives that are common in EX. For an in-depth explanation, I highly recommend the paper 'Semantic Evaluation for Text-to-SQL with Distilled Test Suites'.

Q: Are they both obtained with the evaluation code in Spider? A: EX and TS use the same evaluation code, but they differ in the databases used for the evaluation. EX uses the databases officially supplied with the Spider benchmark, whereas TS uses a set of test suites developed and released by Ruiqi Zhong et al.
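To make the difference concrete, here is a minimal sketch of the idea. It is not the official evaluation script. The `singer` table, the two queries, and the helper names (`make_db`, `same_result`, `execution_match`) are all made up for illustration. The same comparison function is run once against a single database (the EX setting) and once against several database instances (the TS setting).

```python
import sqlite3
from collections import Counter


def make_db(rows):
    """Build a toy in-memory database (hypothetical schema, not a Spider database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO singer VALUES (?, ?)", rows)
    return conn


def same_result(conn, gold_sql, pred_sql):
    """Compare the two result sets as multisets, ignoring row order."""
    gold = Counter(conn.execute(gold_sql).fetchall())
    try:
        pred = Counter(conn.execute(pred_sql).fetchall())
    except sqlite3.Error:
        # A prediction that fails to execute counts as wrong.
        return False
    return gold == pred


def execution_match(dbs, gold_sql, pred_sql):
    """The prediction is correct only if it matches the gold query on every database."""
    return all(same_result(db, gold_sql, pred_sql) for db in dbs)


gold_sql = "SELECT name FROM singer WHERE age > 30"
pred_sql = "SELECT name FROM singer WHERE age > 25"  # semantically different

# EX setting: one database, which happens not to separate the two queries.
single_db = [make_db([("Ann", 20), ("Bob", 40)])]

# TS setting: several database instances, one of which contains a row (age 28)
# that exposes the difference between the two queries.
test_suite = [
    make_db([("Ann", 20), ("Bob", 40)]),
    make_db([("Cat", 28), ("Dan", 35)]),
]

ex_correct = execution_match(single_db, gold_sql, pred_sql)
ts_correct = execution_match(test_suite, gold_sql, pred_sql)
print(ex_correct, ts_correct)  # True False
```

In this toy case the single-database check accepts a wrong query (a false positive), while the check over several databases rejects it. The comparison code is identical in both settings, and only the databases change.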

RUCKBReasoning / codes

Spider results #4