All-Hands-AI / OpenHands

🙌 OpenHands: Code Less, Make More
https://all-hands.dev
MIT License

eval: add commit0 benchmark #5153

Closed nanjiangwill closed 3 days ago

nanjiangwill commented 5 days ago

End-user friendly description of the problem this fixes or functionality that this introduces


Give a summary of what the PR does, explaining any non-trivial design decisions

Commit0 is a from-scratch AI coding challenge.

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

This PR adds the functions needed to run Commit0 with OpenHands.
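For context, the benchmark's success criterion (passing a library's unit tests) can be sketched as a pass-rate computation. This is only an illustration using Python's standard `unittest` module. It is not the Commit0 harness and not the code in this PR. `pass_rate`, `slugify`, and `SlugifyTests` are hypothetical names made up for the example.

```python
import unittest


def pass_rate(suite: unittest.TestSuite) -> float:
    """Run a unittest suite and return the fraction of tests that passed.

    Assumption for this sketch: a test counts as passed unless it is
    recorded as a failure or an error (skipped tests are not penalized).
    """
    result = unittest.TestResult()
    suite.run(result)
    if result.testsRun == 0:
        return 0.0
    failed = len(result.failures) + len(result.errors)
    return (result.testsRun - failed) / result.testsRun


# Hypothetical stand-in for a library function an agent had to rebuild.
def slugify(text: str) -> str:
    return "-".join(text.lower().split())


# Hypothetical stand-in for the library's existing unit tests.
class SlugifyTests(unittest.TestCase):
    def test_lowercases(self):
        self.assertEqual(slugify("Hello"), "hello")

    def test_joins_words(self):
        self.assertEqual(slugify("Hello World"), "hello-world")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
rate = pass_rate(suite)
print(f"pass rate: {rate:.2f}")
```

Note that a `unittest.TestSuite` drops its tests once they have run, so build a fresh suite for each evaluation.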


Link of any specific issues this addresses