defog-ai / sql-eval

Evaluate the accuracy of LLM generated outputs
Apache License 2.0

Evaluation metrics for SQL Query not found #128

Open andreped opened 2 months ago

andreped commented 2 months ago

From this blog post: https://defog.ai/blog/open-sourcing-sqleval/

I saw this sentence:

When looking through the code, I fail to see a metric that captures this. I was expecting some kind of model-graded eval using an LLM (or similar) to determine the complexity of the SQL query itself, or something along those lines. Perhaps it is handled here: https://github.com/defog-ai/sql-eval/blob/main/eval/eval.py#L114

Right now, I only see metrics that evaluate whether the SQL query is valid, the completion runtime, and what the resulting SQL records contain (as a pandas.DataFrame). Am I missing something?
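For context, a result-content check of the kind described above can be sketched roughly like this: run the gold and generated queries, then compare the returned DataFrames up to row and column ordering. This is a minimal illustration with hypothetical helper names, not the repo's actual implementation (which lives in `eval/eval.py`):

```python
import pandas as pd

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    # Sort columns alphabetically, then sort rows by all columns,
    # so that neither column order nor row order affects the comparison.
    df = df.reindex(sorted(df.columns), axis=1)
    return df.sort_values(by=list(df.columns)).reset_index(drop=True)

def results_match(df_gold: pd.DataFrame, df_generated: pd.DataFrame) -> bool:
    # Hypothetical metric: exact match of result sets, ignoring ordering.
    return normalize(df_gold).equals(normalize(df_generated))
```

Such a check only measures whether the generated query returns the right data; it says nothing about the complexity or style of the SQL itself, which is the gap being asked about here.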