Berkeley Function-Calling Leaderboard
Description
This live leaderboard evaluates how accurately LLMs call functions (also known as tools). It is built from real-world data and is updated periodically. For details on the evaluation dataset and methodology, see our blog post and code release.
Leaderboard
Source
Berkeley Function-Calling Leaderboard
Suggested labels