ShishirPatil / gorilla

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
https://gorilla.cs.berkeley.edu/
Apache License 2.0
11.18k stars, 920 forks

Leaderboard data bug + suggested fix #435

Open danieljannai21 opened 3 months ago

danieljannai21 commented 3 months ago

Hi guys,

Added this as a comment to a closed issue, and then thought it would be better as a separate issue, so here goes:

I noticed that there are cases where the gold answer contains a parameter that doesn't appear in the description of the function (for example, the price parameter doesn't exist in the description of the book_room function in question 46 of the execution_multiple_function category). I would suggest running an automatic validation that compares the function's documentation in the tools section, the way it's invoked in the gold answer, and the actual implementation (if one exists), and makes sure they're all consistent. Shouldn't be that hard to implement.
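To make the suggestion concrete, here is a rough sketch of what such a check could look like. It is not based on the repo's actual data loaders: the doc layout (a `name` plus `parameters.properties`), the `check_consistency` and `call_keywords` helpers, and the sample `book_room` parameters are all hypothetical and only illustrate the idea. It also only inspects keyword arguments, so gold answers that pass positional arguments would need extra handling.

```python
import ast
import inspect


def call_keywords(call_src):
    """Parse a call written as source text; return (function name, keyword names)."""
    node = ast.parse(call_src, mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError(f"not a function call: {call_src!r}")
    func = node.func
    name = func.id if isinstance(func, ast.Name) else ast.unparse(func)
    # kw.arg is None for **kwargs expansion, which we cannot check statically.
    return name, [kw.arg for kw in node.keywords if kw.arg is not None]


def check_consistency(doc, gold_call, impl=None):
    """Return a list of mismatch messages; an empty list means everything agrees."""
    problems = []
    documented = set(doc["parameters"]["properties"])
    name, used = call_keywords(gold_call)

    if name != doc["name"]:
        problems.append(f"gold answer calls '{name}', doc describes '{doc['name']}'")
    for arg in used:
        if arg not in documented:
            problems.append(f"gold answer uses undocumented parameter '{arg}'")

    if impl is not None:
        impl_params = set(inspect.signature(impl).parameters)
        for arg in sorted(documented - impl_params):
            problems.append(f"doc lists parameter '{arg}' missing from implementation")
        for arg in sorted(impl_params - documented):
            problems.append(f"implementation accepts undocumented parameter '{arg}'")
    return problems


# Hypothetical data, made up for illustration (not the real question 46 entry).
example_doc = {
    "name": "book_room",
    "parameters": {
        "type": "dict",
        "properties": {
            "room_type": {"type": "string"},
            "nights": {"type": "integer"},
        },
    },
}
example_gold = "book_room(room_type='deluxe', price=200, nights=2)"


def example_impl(room_type, price, nights):
    """Stand-in implementation whose signature includes the undocumented 'price'."""
    return {"room_type": room_type, "price": price, "nights": nights}


if __name__ == "__main__":
    for message in check_consistency(example_doc, example_gold, example_impl):
        print(message)
```

Running this over every entry in each category and failing on a non-empty result would flag cases like the one above before a release.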

Thanks!

HuanzhiMao commented 3 months ago

Hi @danieljannai21, Nice catch and thanks for pointing this out! This is indeed an oversight on our end. We plan to address this issue in the next release by updating the function docs. You are also welcome to make a PR if you would like to contribute :)