-
If an error is thrown during execution, the entire trace is lost, including debug messages. Instead, all messages up to the error, as well as all debug messages, should be included in the log. (Ideall…
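
A minimal sketch of the requested behavior, assuming a Python entry point and the standard `logging` module. The names `run_with_trace`, `ok_step`, and `bad_step` are hypothetical and do not come from the project. The idea is to buffer every message, catch the error, and return the buffered trace along with the error instead of discarding it.

```python
import io
import logging


def run_with_trace(steps):
    """Run callables in order and return (trace_text, error).

    Every message logged before a failure, debug messages included,
    stays in the returned trace.
    """
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setLevel(logging.DEBUG)

    logger = logging.getLogger("trace_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the sketch's output out of the root logger
    logger.addHandler(handler)

    error = None
    try:
        for step in steps:
            logger.debug("starting %s", step.__name__)
            step(logger)
            logger.info("finished %s", step.__name__)
    except Exception as exc:
        # Record the failure in the trace rather than dropping the trace.
        error = exc
        logger.error("failed: %r", exc)
    finally:
        handler.flush()
        logger.removeHandler(handler)

    return buffer.getvalue(), error


def ok_step(logger):
    logger.debug("ok_step detail")


def bad_step(logger):
    logger.debug("bad_step detail")
    raise ValueError("boom")


trace, error = run_with_trace([ok_step, bad_step])
print(trace)
```

Here the caller decides what to do with `error` (re-raise, report, or exit non-zero) after the trace has been written out.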
-
### Details
_No response_
Checklist
- [X] Modify `src/main.py` ✓ https://github.com/sweepai/evals/commit/35f37cdbd49a63bfdd7a39e2d65b0c186ad83c49
- [X] Ran sandbox for `src/main.py`. ✓ https://gi…
-
**Description**
Naga currently rejects programs with constant expressions such as this one:
```wgsl
const asdf = false && (123 / 0 > 0);
```
An uninformed reader might assume that divid…
-
I think the readme.md has some issues regarding the evals.
I just noticed it with piqa: the numbers are too low compared to the actual paper.
-
I'm keeping https://github.com/ErikBjare/are-copilots-local-yet up-to-date, and would love to see some codellama numbers given it's now SOTA :)
-
This is what I get with `master`:
```
~\.julia\dev\BlackBoxOptim\examples [master ≡]> julia .\rosenbrock_parallel.jl
Starting optimization with optimizer XNESOpt{Float64,RandomBound{ContinuousRectS…
-
### Describe the feature or improvement you're requesting
Currently the evals framework does not support the Azure OpenAI implementation. This is a blocker if someone wants to use eval with Azure OpenAI implem…
-
If I have something like `@b rand(1000) sort!`, the first eval is much slower than subsequent evals within a given sample, which violates benchmarking assumptions and produces weird results. For exa…
-
-
### Details
_No response_
Checklist
- [X] Modify `src/main.py` ✓ https://github.com/sweepai/evals/commit/79c50514c76cc63da87009fa58909bf838a262c9
- [X] Ran sandbox for `src/main.py`. ✗
- [X] Modi…