Canner / WrenAI

🚀 An open-source SQL AI (Text-to-SQL) Agent that empowers data, product teams to chat with their data. 🤘
https://getwren.ai/oss
GNU Affero General Public License v3.0

[EPIC] First Version of Wren AI Service Evaluation Framework #355

Closed by cyyeh 3 months ago

cyyeh commented 5 months ago

Context

To deliver a great generative AI project, a robust evaluation system is essential. Without a useful evaluation system, we can't easily tell how well or badly our system performs, and the evaluation process can't be automated. (Even with an evaluation system, human-in-the-loop review is still needed, but having one definitely reduces a lot of human effort.) If you are curious about the topic, we've learned a lot from the community, and we hope the resources in the References section help you grasp the concepts behind building one.

The evaluation framework is purpose-built for WrenAI, and more AI pipelines will be added over time. For the first version of the framework, however, we'll focus on the most important and most used AI pipeline: the ask pipeline, which is essentially the text-to-SQL task.

Goal

Tasks

References