nongfang55 closed this 2 weeks ago
We are currently organizing the prompts for two SQL-related metrics here: https://github.com/QwenLM/Qwen2.5-Coder/tree/codeqwen1_5/evaluation/text_to_sql
As for the specific evaluation scripts, we are still working on a clean open-source version.
Looking forward to the release!
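I saw that your team has released the logic for some of the evaluation benchmarks, and I would like to refer to your evaluation implementation for Spider and BIRD-SQL. Neither of these two benchmarks is integrated into opencompass or harness.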