symflower/eval-dev-quality
DevQualityEval: An evaluation benchmark 📈 and framework to compare and evolve the quality of code generation of LLMs.
https://symflower.com/en/company/blog/2024/dev-quality-eval-v0.4.0-is-llama-3-better-than-gpt-4-for-generating-tests/
MIT License
137 stars, 5 forks
issues
Move Dependency installation of Docker into multistage builds
#319
Munsio
opened
3 months ago
0
Transpile repository for Ruby
#318
ruiAzevedo19
closed
3 months ago
0
Infer the language we want to transpile from the file extension, so other languages are supported for transpilation
#317
ruiAzevedo19
closed
3 months ago
0
Mistakes repository for Ruby
#316
ruiAzevedo19
closed
3 months ago
1
Light repository for Ruby
#315
ruiAzevedo19
closed
3 months ago
0
Free GitHub runner disk space by removing unnecessary pre-packaged stuff
#314
Munsio
closed
3 months ago
1
Do not register Ruby language yet, since the test execution feature is still in progress
#313
ruiAzevedo19
closed
3 months ago
0
Copy of evaluation data for kubernetes is not working
#312
Munsio
opened
3 months ago
0
Introduce the Ruby language
#311
ruiAzevedo19
closed
3 months ago
0
302 without fix
#310
bauersimon
closed
3 months ago
1
Add Ruby as Dependency to the Docker container
#309
Munsio
closed
3 months ago
1
Copying docker results did not work without setting the `result-path` parameter
#308
Munsio
closed
3 months ago
2
Follow-up: Use a JSON configuration file to set up an evaluation run
#307
ruiAzevedo19
opened
3 months ago
0
Check if the repository for the transpile task is valid before running the evaluation, so it is checked just once
#306
ruiAzevedo19
closed
3 months ago
0
Rethink retry logic for LLM Providers
#305
Munsio
opened
3 months ago
0
Reporting tool does not work with `/**/` directory globbing
#304
Munsio
opened
3 months ago
5
Bump symflower version to stay on the latest version possible
#303
Munsio
closed
3 months ago
0
Docker runtime broken on main
#302
bauersimon
closed
3 months ago
5
Roadmap for v0.7.0
#301
bauersimon
opened
3 months ago
0
Ruby support
#300
ahumenberger
closed
2 weeks ago
4
HTML report for data visualization
#299
ruiAzevedo19
opened
3 months ago
0
Store models meta information in a CSV file, so it can be further used in data visualization
#298
ruiAzevedo19
closed
3 months ago
1
Update README to 0.5.0 blog post
#297
bauersimon
closed
3 months ago
1
Data visualization based on evaluation CSV files
#296
ruiAzevedo19
opened
3 months ago
1
Script for v0.6.0 run
#295
Munsio
closed
3 months ago
1
Update to latest Symflower version for improved static code repair
#294
bauersimon
closed
3 months ago
0
0.6.0 test runs
#293
Munsio
closed
3 months ago
1
Satisfy Kubernetes convention to only have alphanumeric characters and "-" in job names
#292
Munsio
closed
3 months ago
0
Load selected models and repositories from the JSON configuration file, to set up an evaluation run
#291
ruiAzevedo19
closed
3 months ago
1
Only run model/provider as well as testdata checks on the host if the runtime is not containerized
#290
Munsio
closed
3 months ago
0
K8s secrets
#289
Munsio
closed
3 months ago
0
Write openrouter models to CSV and reject models that we want to ignore automatically
#288
bauersimon
closed
3 months ago
0
Store repositories in JSON
#287
bauersimon
closed
3 months ago
0
Openrouter Provider preferences
#286
Munsio
opened
4 months ago
2
Write available and selected models into a configuration file for documentation/reproducibility
#285
bauersimon
closed
4 months ago
1
Pull ollama models
#284
Munsio
closed
4 months ago
0
Pull ollama models
#283
Munsio
closed
4 months ago
0
Use a JSON configuration file to set up an evaluation run
#282
ruiAzevedo19
closed
3 months ago
1
fix, Ignore git and Maven directories when validating the code repair repository, since they do not need any validation
#281
ruiAzevedo19
closed
4 months ago
0
"symflower unit-tests" timeout error differs between Linux and Windows
#280
ruiAzevedo19
closed
1 month ago
1
Use a pinned version for Java 11 dependency
#279
Munsio
closed
4 months ago
0
Log model responses as artifact in separate file
#278
ahumenberger
closed
4 months ago
2
fix, Handle inconsistent timeout error on Windows
#277
ruiAzevedo19
closed
4 months ago
0
Flaky test when testing `symflower unit-tests` timeout
#276
ruiAzevedo19
closed
4 months ago
0
fix, Use the correct Maven snapshot format in the Java test data, to have a cleaner output without warnings
#275
ruiAzevedo19
closed
4 months ago
0
Always numerize the result path of containerized runs to avoid I/O sync problems
#274
Munsio
closed
4 months ago
0
Docker containers may use the same result-path
#273
Munsio
closed
4 months ago
0
Copy cluster data and documentation update
#272
Munsio
closed
4 months ago
0
Introduce the "report" command to combine multiple evaluations into a single file
#271
ruiAzevedo19
closed
4 months ago
0
Malformed Maven version
#270
Munsio
closed
4 months ago
0