DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
Make sure to use uint64 consistently for metrics and scoring, and allow more task cases by always working on a clean repository #133
Closed
zimmski closed 1 month ago
@ruiAzevedo19 Merging directly, just FYI