AlibabaResearch / DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
MIT License

Incorrect evidence in dev set example, db_id: california_schools #63

Closed wbbeyourself closed 1 year ago

wbbeyourself commented 1 year ago

In california_schools/frpm.csv, the eligible free rate is defined as Free Meal Count / Enrollment, and the column `Percent (%) Eligible Free (K-12)` does indeed equal `Free Meal Count (K-12)` / `Enrollment (K-12)`.

However, the very first example in the dev set contains errors.

Original data:

db_id: california_schools

question: What is the highest eligible free rate for K-12 students in the schools in Alameda County?

evidence: Eligible free rate for K-12 = FRPM Count (K-12) / Enrollment (K-12)

Gold SQL: SELECT `FRPM Count (K-12)` / `Enrollment (K-12)` FROM frpm WHERE `County Name` = 'Alameda' ORDER BY (CAST(`FRPM Count (K-12)` AS REAL) / `Enrollment (K-12)`) DESC LIMIT 1

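There are two errors:

(1) The evidence is wrong. It should read: Eligible free rate for K-12 = Free Meal Count (K-12) / Enrollment (K-12)

(2) The Gold SQL is more complicated than it needs to be; the division is redundant. Because the column `Percent (%) Eligible Free (K-12)` already holds the eligible free rate for K-12, the query can simply be written as follows, with no division: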

SELECT MAX(`Percent (%) Eligible Free (K-12)`) FROM frpm WHERE `County Name` = 'Alameda'
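To make the difference concrete, here is a minimal sketch that runs both queries against a toy in-memory SQLite table. The three rows are made-up values chosen only so that the two formulas disagree; they are not taken from frpm.csv. The variable names and the `scalar` helper are hypothetical, and the backtick identifier quoting is assumed to match the dataset's convention.

```python
import sqlite3

# Toy stand-in for the frpm table. The rows below are invented for
# illustration and are NOT real values from california_schools/frpm.csv.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE frpm (
        `County Name` TEXT,
        `Enrollment (K-12)` REAL,
        `Free Meal Count (K-12)` REAL,
        `FRPM Count (K-12)` REAL,
        `Percent (%) Eligible Free (K-12)` REAL
    )
""")
rows = [
    # county, enrollment, free meal count, FRPM count, percent eligible free
    ("Alameda", 100.0, 50.0, 90.0, 0.5),
    ("Alameda", 200.0, 150.0, 160.0, 0.75),
    ("Fresno", 100.0, 90.0, 95.0, 0.9),
]
conn.executemany("INSERT INTO frpm VALUES (?, ?, ?, ?, ?)", rows)

# Gold SQL as quoted in this issue: it divides FRPM Count by Enrollment.
gold_as_quoted = """
    SELECT `FRPM Count (K-12)` / `Enrollment (K-12)`
    FROM frpm
    WHERE `County Name` = 'Alameda'
    ORDER BY (CAST(`FRPM Count (K-12)` AS REAL) / `Enrollment (K-12)`) DESC
    LIMIT 1
"""

# Query proposed in this issue: read the precomputed rate column directly.
proposed = """
    SELECT MAX(`Percent (%) Eligible Free (K-12)`)
    FROM frpm
    WHERE `County Name` = 'Alameda'
"""

# Same rate computed explicitly from Free Meal Count, for comparison.
free_meal_division = """
    SELECT MAX(`Free Meal Count (K-12)` / `Enrollment (K-12)`)
    FROM frpm
    WHERE `County Name` = 'Alameda'
"""


def scalar(sql):
    """Run a query and return the single value in its first row."""
    return conn.execute(sql).fetchone()[0]


gold_result = scalar(gold_as_quoted)
proposed_result = scalar(proposed)
division_result = scalar(free_meal_division)

print("gold (FRPM Count / Enrollment):", gold_result)
print("proposed (Percent column):", proposed_result)
print("Free Meal Count / Enrollment:", division_result)
```

On this toy data the FRPM-based query picks a different school and returns a different rate than the `Percent (%) Eligible Free (K-12)` column, while the proposed query agrees with the explicit Free Meal Count division. Whether the same holds for every row of the real file would need to be checked against frpm.csv itself.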

There are probably quite a few similar problems elsewhere in the dataset.