symflower / eval-dev-quality

DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/
MIT License
137 stars, 5 forks

Use new Symflower version which reduces error output of the "fix" command #323

Closed (by bauersimon, 3 months ago)

zimmski commented 3 months ago

@bauersimon where can I see an instance of "symflower fix" misbehaving?

bauersimon commented 3 months ago

@Munsio should have some examples... We did not upload these logs to GH because they amounted to many GBs.

zimmski commented 3 months ago

> @Munsio should have some examples... We did not upload these logs to GH because they amounted to many GBs.

@Munsio can you show me?

Also, are there really no threads for comments on GH? WTH...