-
**Describe the bug**
The json reporter does not print benchmark results.
**Expected result**
The xml reporter provides the benchmark results: [bench.xml.txt](https://github.com/user-attachme…
-
For historical reasons, there are currently TunableGroups and Tunables in mlos_bench (the benchmarking portion) which are converted to ConfigSpace HyperParameters for mlos_core (the optimizer portion)…
-
For some reason, the carpenter bench is placed as a mob head. It places normally in single player, but on my friend's online server it shows up as just a mob head.
![image](https://github.com/user-attachment…
-
## Description of the issue
Restoring a v13 backup into v14 throws an error.
## Context information (for bug reports)
**Output of `bench version`**
```
erpnext 14.69.0
frappe 14.74.0
```
…
-
### Clear and concise description of the problem
We already have 150 tests that we want to benchmark.
Currently we have to copy them manually, replace each `it()` with a `bench()`, and then run our benchmarks.
### S…
-
**Describe the bug**
The ilab workflow tested in basic-workflow-tests.sh (https://github.com/instructlab/instructlab/blob/main/scripts/basic-workflow-tests.sh#L196) tests model evaluation, howeve…
-
roachtest.vm_preemption [failed](https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyGceBazel/16022118?buildTab=log) with [artifacts](https://teamcity.cockroachdb.c…
-
Now, when we start two workflows, one with no flag and another with NOGIL enabled, the bench_runner will output:
1. Merge base (no flag) vs PR/ref (no flag)
2. Merge base (NOGIL) vs PR/ref (NOGIL…
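
The pairing described above, where each flag configuration is compared only against itself, could be sketched roughly as follows. This is an illustration only, not the bench_runner's actual code: the function name `comparison_labels` and the representation of flags as lists of strings are assumptions made for the example.

```python
from typing import List


def comparison_labels(flag_sets: List[List[str]]) -> List[str]:
    """Build one 'merge base vs PR/ref' label per flag configuration.

    Hypothetical helper: each configuration is compared with itself,
    so a NOGIL base is only ever paired with a NOGIL PR/ref build.
    """
    labels = []
    for flags in flag_sets:
        # An empty flag list stands for the default ("no flag") build.
        name = ", ".join(flags) if flags else "no flag"
        labels.append(f"Merge base ({name}) vs PR/ref ({name})")
    return labels


# Two workflows: one without flags, one with NOGIL enabled.
labels = comparison_labels([[], ["NOGIL"]])
for index, label in enumerate(labels, start=1):
    print(f"{index}. {label}")
```

Keeping the flag set identical on both sides of each comparison means a difference in the results reflects the change under test rather than the build configuration.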
-
Develop a mechanism to include and execute external service-tasks as part of the `sema.bench` chain.
-
**Is your feature request related to a problem? Please describe.**
Currently `ilab model evaluate` only supports locally hosted models being used as judges for MT-Bench. FastChat has native support f…