symflower eval-dev-quality issues

symflower / eval-dev-quality

DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.

https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/

MIT License

issues

Move Dependency installation of Docker into multistage builds

#319 Munsio opened 3 months ago
0
Transpile repository for Ruby

#318 ruiAzevedo19 closed 3 months ago
0
Infer the language we want to transpile from the file extension, so other languages are supported for transpilation

#317 ruiAzevedo19 closed 3 months ago
0
Mistakes repository for Ruby

#316 ruiAzevedo19 closed 3 months ago
1
Light repository for Ruby

#315 ruiAzevedo19 closed 3 months ago
0
Free GitHub runner disk space by removing unnecessary pre-packaged stuff

#314 Munsio closed 3 months ago
1
Do not register Ruby language yet, since the test execution feature is still in progress

#313 ruiAzevedo19 closed 3 months ago
0
Copy of evaluation data for kubernetes is not working

#312 Munsio opened 3 months ago
0
Introduce the Ruby language

#311 ruiAzevedo19 closed 3 months ago
0
302 without fix

#310 bauersimon closed 3 months ago
1
Add Ruby as Dependency to the Docker container

#309 Munsio closed 3 months ago
1
Copying docker results did not work without setting the `result-path` parameter

#308 Munsio closed 3 months ago
2
Follow-up: Use a JSON configuration file to set up an evaluation run

#307 ruiAzevedo19 opened 3 months ago
0
Check if the repository for the transpile task is valid before running the evaluation, so it is checked just once

#306 ruiAzevedo19 closed 3 months ago
0
Rethink retry logic for LLM Providers

#305 Munsio opened 3 months ago
0
Reporting tool does not work with `/**/` directory globbing

#304 Munsio opened 3 months ago
5
Bump symflower version to stay on the latest version possible

#303 Munsio closed 3 months ago
0
Docker runtime broken on main

#302 bauersimon closed 3 months ago
5
Roadmap for v0.7.0

#301 bauersimon opened 3 months ago
0
Ruby support

#300 ahumenberger closed 2 weeks ago
4
HTML report for data visualization

#299 ruiAzevedo19 opened 3 months ago
0
Store models meta information in a CSV file, so it can be further used in data visualization

#298 ruiAzevedo19 closed 3 months ago
1
Update README to 0.5.0 blog post

#297 bauersimon closed 3 months ago
1
Data visualization based on evaluation CSV files

#296 ruiAzevedo19 opened 3 months ago
1
Script for v0.6.0 run

#295 Munsio closed 3 months ago
1
Update to latest Symflower version for improved static code repair

#294 bauersimon closed 3 months ago
0
0.6.0 test runs

#293 Munsio closed 3 months ago
1
Satisfy Kubernetes convention to only have alphanumeric characters and "-" in job names

#292 Munsio closed 3 months ago
0
Load selected models and repositories from the JSON configuration file, to set up an evaluation run

#291 ruiAzevedo19 closed 3 months ago
1
Only run model/provider as well as testdata checks on the host if the runtime is not containerized

#290 Munsio closed 3 months ago
0
K8s secrets

#289 Munsio closed 3 months ago
0
Write OpenRouter models to CSV and reject models that we want to ignore automatically

#288 bauersimon closed 3 months ago
0
Store repositories in JSON

#287 bauersimon closed 3 months ago
0
Openrouter Provider preferences

#286 Munsio opened 4 months ago
2
Write available and selected models into a configuration file for documentation/reproducibility

#285 bauersimon closed 4 months ago
1
Pull ollama models

#284 Munsio closed 4 months ago
0
Pull ollama models

#283 Munsio closed 4 months ago
0
Use a JSON configuration file to set up an evaluation run

#282 ruiAzevedo19 closed 3 months ago
1
fix, Ignore git and Maven directories when validating the code repair repository, since they do not need any validation

#281 ruiAzevedo19 closed 4 months ago
0
"symflower unit-tests" timeout error differs between Linux and Windows

#280 ruiAzevedo19 closed 1 month ago
1
Use a pinned version for Java 11 dependency

#279 Munsio closed 4 months ago
0
Log model responses as artifact in separate file

#278 ahumenberger closed 4 months ago
2
fix, Handle inconsistent timeout error on Windows

#277 ruiAzevedo19 closed 4 months ago
0
Flaky test when testing `symflower unit-tests` timeout

#276 ruiAzevedo19 closed 4 months ago
0
fix, Use the correct Maven snapshot format in the Java test data, to have a cleaner output without warnings

#275 ruiAzevedo19 closed 4 months ago
0
Always numerize the result path of containerized runs to avoid I/O sync problems

#274 Munsio closed 4 months ago
0
Docker containers may use the same result-path

#273 Munsio closed 4 months ago
0
Copy cluster data and documentation update

#272 Munsio closed 4 months ago
0
Introduce the "report" command to combine multiple evaluations into a single file

#271 ruiAzevedo19 closed 4 months ago
0
Malformed Maven version

#270 Munsio closed 4 months ago
0
