Benchmarking with SWE-Bench (or its "Lite" version)

Significant-Gravitas / AutoGPT-Code-Ability

🖥️ AutoGPT's Coding Ability - empowering everyone to build software using AI

Open BradKML opened 4 weeks ago

BradKML commented 4 weeks ago

There is a benchmark, SWE-Bench (https://www.swebench.com/), designed for evaluating fixes across a full repository. It is already used to evaluate other tools such as OpenDevin, AutoCodeRover, Aider, and SWE-Agent.