symflower / eval-dev-quality

DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/
MIT License

https://github.com/symflower/eval-dev-quality/pull/155/files missing a test #159

Closed by zimmski 1 week ago

zimmski commented 3 weeks ago

All changes need tests in TDD style: fail first, pass with the changes. Same rules as internally.

Munsio commented 3 weeks ago

Well, the test case itself uses filepath.Join for the paths, so whether it runs on Windows, macOS, or Linux, it uses the path separator of that particular OS.

zimmski commented 3 weeks ago

What was the reason for changing this then? I thought you said something failed/did not work?

Munsio commented 3 weeks ago

The test case was wrong in the first place because it used a hardcoded / for paths. Because of that, the test succeeded on Windows even though it shouldn't have.

After I switched the expected result in the test case to use filepath.Join instead of the hardcoded paths, the test did in fact fail on Windows until I changed the implementation of TestFilePath to use filepath.Join too.