Closed Hambaobao closed 2 months ago
Could you run `which eval-dev-quality`, please? Did you try to run the benchmark using the container image? https://github.com/symflower/eval-dev-quality#run-the-evaluation-either-with-the-built-or-pulled-image
Thank you very much for your prompt reply. I'm not very familiar with Go. I just checked, and `which eval-dev-quality` showed `eval-dev-quality not found`. I eventually found `eval-dev-quality` in `/root/go/bin/`. I will try it again.
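For anyone hitting the same `command not found` message: the usual cause is that the directory holding the binary is not on `PATH`. A minimal sketch of a fix, assuming the binary really is in `/root/go/bin` as reported above (adjust `GO_BIN_DIR` if yours was installed elsewhere):

```shell
# Assumption: the binary lives in /root/go/bin, as reported in this thread.
GO_BIN_DIR="/root/go/bin"

# Append the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$GO_BIN_DIR:"*) ;;
  *) export PATH="$PATH:$GO_BIN_DIR" ;;
esac

# Show where the shell now resolves the command, if anywhere.
command -v eval-dev-quality || echo "eval-dev-quality is still not on PATH"
```

This only affects the current shell session. To make it permanent, the same `export` line would have to go into the shell's startup file.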
Hi, I am now able to successfully conduct tests, but the final log shows:
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “java” because it did not succeed basic checks
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “java” because it did not succeed basic checks
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “java” because it did not succeed basic checks
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “ruby” because it did not succeed basic checks
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “ruby” because it did not succeed basic checks
2024/09/11 17:37:36 Excluding model “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” for language “ruby” because it did not succeed basic checks
2024/09/11 17:37:36 Evaluation score for “custom-vllm/DeepSeek-Coder-V2-Lite-Instruct” (“category-unknown”): score=2259, coverage=760, files-executed=52, files-executed-maximum-reachable=77, generate-tests-for-file-character-count=55919, processing-time=242195, response-character-count=56956, response-no-error=77, response-no-excess=75, response-with-code=75, tests-passing=1220
It appears I am unable to test the Java and Ruby languages. Is this normal, or is it caused by my environment not being configured correctly? My current environment can evaluate MultiPL-E without problems.
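To see at a glance which languages were dropped, the "Excluding model" lines can be summarized from a saved copy of the log. This is only a sketch: the file name `evaluation.log` and the helper `excluded_languages` are hypothetical, and the pattern assumes the log wording shown above.

```shell
# Hypothetical log file name; point this at wherever the evaluation log was saved.
LOG_FILE="${LOG_FILE:-evaluation.log}"

# Print each language that appears in an "Excluding model" line, once.
# The pattern skips whatever quote character precedes the language name.
excluded_languages() {
  grep 'Excluding model' "$1" \
    | sed -n 's/.*for language [^a-zA-Z0-9]*\([a-zA-Z0-9_+#-]*\).*/\1/p' \
    | sort -u
}

# Only run the summary if the log file actually exists.
if [ -f "$LOG_FILE" ]; then
  excluded_languages "$LOG_FILE"
fi
```

For the log excerpt above, this would list `java` and `ruby`. It does not explain why the basic checks failed.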
Hello, and thank you for your assistance. I encountered some issues during the evaluation process. I followed your instructions to install all the necessary packages:
However, I'm facing an issue where the terminal returns `eval-dev-quality: command not found`. Could you please help me identify what might be missing?