samkhur006 / awesome-llm-planning-reasoning

A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning materials.
MIT License

Consider adding AppWorld to the list #2

Closed: HarshTrivedi closed 1 month ago

HarshTrivedi commented 2 months ago

Thanks for setting up this repository! Please consider adding AppWorld to it.

🔗 Website: https://appworld.dev/
📄 Paper: https://arxiv.org/abs/2407.18901
🐦 Tweet: https://x.com/harsh3vedi/status/1818311843976233198
💬 Blog: https://appworld.dev/blog
🎬 Video(s): https://appworld.dev/video
🌎 Code: https://github.com/stonybrooknlp/appworld
🧭 Data (task, trajectories) explorer, playground: https://appworld.dev/task-explorer
🔍 API explorer: https://appworld.dev/api-explorer
📊 Leaderboard: https://appworld.dev/leaderboard

TLDR: Introduces AppWorld Engine, a high-fidelity execution environment of 9 day-to-day apps, operable via 457 APIs, populated with digital activities of 106 people living in a simulated world, and an associated benchmark of natural, diverse, and challenging autonomous agent tasks requiring rich and interactive coding.

It best fits the "Benchmarks" section.

Thank you!

samkhur006 commented 2 months ago

Thank you for the suggestion and for sharing this interesting paper on tool use! We're currently in the process of adding a few more papers to the repository, and we'll be sure to include this one in the "Benchmarks" section as well. Thanks again!

HarshTrivedi commented 2 months ago

Sounds good, thank you @samkhur006 !!