Closed iamlockelightning closed 8 months ago
@iamlockelightning Thanks for your interest and for the great suggestion for AgentBench! We'd love to add more challenging samples to the DB task as part of our future work. In the meantime, you are welcome to contribute to the next version of AgentBench if you would like to. Please feel free to open a PR to add data, tasks, or environments!
Thank you so much for publishing such an elegant framework for evaluating LLM Agents.
Would you consider adding more difficult data to the DB task? As far as I can see, the task only contains single-table SQL queries, which are easy to solve and fall short of real-world cases.
Other high-quality datasets, such as Spider 1.0, contain complex queries (multi-table joins, etc.).
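To make the gap concrete, here is a rough sketch contrasting the two kinds of query. The `singer`/`concert` schema, the rows, and the helper names `build_demo_db` and `run` are all hypothetical and made up for illustration; they are not taken from AgentBench or Spider.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (
            singer_id INTEGER PRIMARY KEY,
            name TEXT,
            country TEXT
        );
        CREATE TABLE concert (
            concert_id INTEGER PRIMARY KEY,
            singer_id INTEGER REFERENCES singer(singer_id),
            year INTEGER
        );
        INSERT INTO singer VALUES (1, 'Ann', 'France'), (2, 'Bo', 'Japan'), (3, 'Cy', 'France');
        INSERT INTO concert VALUES (10, 1, 2014), (11, 1, 2015), (12, 2, 2014);
        """
    )
    return conn


# Single-table query: a filter on one table, no join needed.
SINGLE_TABLE_SQL = "SELECT name FROM singer WHERE country = 'France' ORDER BY name"

# Multi-table query: needs a join plus aggregation, and the LEFT JOIN
# keeps singers who have no concerts at all.
MULTI_TABLE_SQL = """
    SELECT s.name, COUNT(c.concert_id) AS n_concerts
    FROM singer AS s
    LEFT JOIN concert AS c ON c.singer_id = s.singer_id
    GROUP BY s.singer_id, s.name
    ORDER BY s.name
"""


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print(run(conn, SINGLE_TABLE_SQL))
    print(run(conn, MULTI_TABLE_SQL))
```

The second query requires the agent to work out the foreign-key relationship and the join type, which is closer to what I would expect in real-world usage.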
Hope to see more complex SQL data in this task. 👍