Closed DavidKorczynski closed 1 month ago
/gcbrun exp -n dk-test-to-harness-exp-2 -m vertex_ai_gemini-1-5 -i -b comparison from-test-small
/gcbrun exp -n dk-test-to-harness-exp-3 -m vertex_ai_gemini-1-5 -i -b from-test-small
The small benchmark run is looking good: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-08-18-544-dk-test-to-harness-exp-3-from-test-small/index.html
Testing on the larger benchmark now.
/gcbrun exp -n dk-test-to-harness-exp-4 -m vertex_ai_gemini-1-5 -i -b from-test-large
Many projects have more test files available, but I am limiting it to 9 files per project for now, which should be enough for assessing quality. There are ~160 projects in this benchmark.
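
For reference, the 9-file cap could be sketched roughly as below. This is a minimal illustration and not the code in this PR: the names `MAX_TEST_FILES` and `select_test_files` are hypothetical, and matching on "test" in the file name is an assumed heuristic for finding test files.

```python
from pathlib import Path

# Cap mentioned in the comment above; the constant name is hypothetical.
MAX_TEST_FILES = 9


def select_test_files(project_dir: str, limit: int = MAX_TEST_FILES) -> list[str]:
    """Return at most `limit` test-file paths found under `project_dir`.

    Assumption: a file counts as a test file if "test" appears in its name.
    Paths are sorted so the selection is deterministic across runs.
    """
    candidates = sorted(
        str(path)
        for path in Path(project_dir).rglob("*")
        if path.is_file() and "test" in path.name.lower()
    )
    return candidates[:limit]
```

Sorting before truncating keeps the selected subset stable between experiment runs, which makes results easier to compare.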