HarshTrivedi closed this issue 1 month ago
Thank you for the suggestion and for sharing this interesting paper on tool use! We're currently in the process of adding a few more papers to the repository, and we'll be sure to include this one in the "Benchmarks" section as well. Thanks again!
Sounds good, thank you @samkhur006 !!
Thanks for setting up this repository! Please consider adding AppWorld to it.
🔗 Website: https://appworld.dev/
📄 Paper: https://arxiv.org/abs/2407.18901
🐦 Tweet: https://x.com/harsh3vedi/status/1818311843976233198
💬 Blog: https://appworld.dev/blog
🎬 Video(s): https://appworld.dev/video
🌎 Code: https://github.com/stonybrooknlp/appworld
🧭 Data (task, trajectories) explorer, playground: https://appworld.dev/task-explorer
🔍 API explorer: https://appworld.dev/api-explorer
📊 Leaderboard: https://appworld.dev/leaderboard
TLDR: Introduces AppWorld Engine, a high-fidelity execution environment of 9 day-to-day apps, operable via 457 APIs, populated with digital activities of 106 people living in a simulated world, and an associated benchmark of natural, diverse, and challenging autonomous agent tasks requiring rich and interactive coding.
It best fits the "Benchmarks" section.
Thank you!