symflower / eval-dev-quality

DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/
MIT License

Kubernetes job watching does not work when erroring #329

Open Munsio opened 3 months ago

Munsio commented 3 months ago

When we run multiple models with the Kubernetes runner, we only watch for the completed type on the job. When a model is no longer available, the job goes into an error state, which we also need to take into account. Otherwise the watch never sees the event it is waiting for.
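As a rough illustration of the missing case, here is a minimal sketch that treats both terminal job conditions as "done". It is not the runner's actual code: `JobCondition`, `JobOutcome` and `jobOutcome` are hypothetical names defined locally so the example runs without the Kubernetes client libraries. The only facts taken from Kubernetes are that a `batch/v1` Job reports the condition types `Complete` and `Failed`, each with a status of `True` once it applies.

```go
package main

import "fmt"

// JobCondition is a hypothetical local stand-in for the two fields of a
// Kubernetes batch/v1 job condition that matter for this check.
type JobCondition struct {
	Type   string // terminal types are "Complete" and "Failed"
	Status string // "True", "False" or "Unknown"
}

// JobOutcome is a hypothetical result type for the watcher.
type JobOutcome string

const (
	JobRunning   JobOutcome = "running"
	JobSucceeded JobOutcome = "succeeded"
	JobFailed    JobOutcome = "failed"
)

// jobOutcome inspects the conditions of a job and reports whether it has
// reached a terminal state. Checking only "Complete" is the behavior the
// issue describes; the "Failed" case is the addition.
func jobOutcome(conditions []JobCondition) JobOutcome {
	for _, c := range conditions {
		// A condition only applies when its status is "True".
		if c.Status != "True" {
			continue
		}
		switch c.Type {
		case "Complete":
			return JobSucceeded
		case "Failed":
			return JobFailed
		}
	}
	// No terminal condition is set yet, so keep watching.
	return JobRunning
}

func main() {
	// Example: a job whose model is unavailable ends up with a "Failed" condition.
	conditions := []JobCondition{{Type: "Failed", Status: "True"}}
	fmt.Println(jobOutcome(conditions))
}
```

With a check like this, the watch loop could stop on either `JobSucceeded` or `JobFailed` and report the failed model instead of waiting indefinitely.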