-
**What would you like to be added**:
Expose metrics about the JobSet resource: https://jobset.sigs.k8s.io/docs/reference/jobset.v1alpha2/
**Why is this needed**:
For better monitoring of AI t…
-
Hello, I have two questions:
1. How do I view the metrics after the game is over: Win Rate | PBR | RUR | APU | TR?
2. When I run `test_the_env.py`, I get an error once the game is over:
![173148890…
-
We should track host-level metrics such as memory usage, disk usage, CPU usage, etc., and combine them with the existing block processing time data `replayor` is already collecting to give a more holist…
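As a rough illustration of the idea, here is a minimal Python sketch that samples a couple of host-level metrics with the standard library and attaches them to a block processing time. It is not `replayor` code: the function names, the record layout, and the `0.25` timing value are all hypothetical. Memory usage is left out because the standard library has no portable call for it; a library such as `psutil` would be one way to add it.

```python
import os
import shutil
import time


def sample_host_metrics(path=os.path.abspath(os.sep)):
    """Take one snapshot of host-level metrics (hypothetical helper)."""
    disk = shutil.disk_usage(path)
    try:
        # 1-minute load average; not available on every platform.
        load_1m = os.getloadavg()[0]
    except (AttributeError, OSError):
        load_1m = None
    return {
        "timestamp": time.time(),
        "disk_used_fraction": disk.used / disk.total if disk.total else None,
        "load_avg_1m": load_1m,
    }


def combine(block_processing_s, host_metrics):
    """Merge a block processing time with a host snapshot into one record."""
    return {"block_processing_s": block_processing_s, **host_metrics}


# Placeholder timing value, for illustration only.
record = combine(0.25, sample_host_metrics())
```

Sampling at the same moment a block finishes keeps the two kinds of data aligned, so slow blocks can later be compared against the host state at that time.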
-
Although a summary is printed at the end of each run, I think it would be good to collect more detailed metrics and store them in some sort of database.
The goal would be that I can query it to fin…
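One possible shape for this, sketched with Python's built-in `sqlite3` module. The table name, column names, metric names, and run IDs below are assumptions made for illustration, not part of the project, and the example query is just one kind of lookup such a store would allow.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a metrics store; the schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_metrics ("
        "run_id TEXT NOT NULL, name TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def record_metrics(conn, run_id, metrics):
    """Insert one row per metric for the given run, in a single transaction."""
    with conn:
        conn.executemany(
            "INSERT INTO run_metrics (run_id, name, value) VALUES (?, ?, ?)",
            [(run_id, name, float(value)) for name, value in metrics.items()],
        )


def best_run(conn, name):
    """Return (run_id, value) for the run with the highest value of a metric."""
    return conn.execute(
        "SELECT run_id, value FROM run_metrics "
        "WHERE name = ? ORDER BY value DESC LIMIT 1",
        (name,),
    ).fetchone()


conn = open_store()
record_metrics(conn, "run-1", {"accuracy": 0.81, "duration_s": 12.5})
record_metrics(conn, "run-2", {"accuracy": 0.86, "duration_s": 14.0})
```

A long, narrow table (one row per metric per run) means new metrics can be added later without a schema change, at the cost of slightly more verbose queries.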
-
Basic metrics
like how to measure the convergence rate, as the assignment says.
j ~ (A(j, s) - A(max, s))
What will it be when "max" changes, and when "s" changes?
And explore how this differe…
-
### Description
Some runtime metrics, e.g. `memStats.Alloc` and `memStats.Sys`, are missing.
### Steps To Reproduce
* By checking [the code](https://github.com/open-telemetry/opentelemetry-go-c…
-
### Description & Motivation
When training a model, I have to specify dataloaders, epochs, and learning rate, and I would like them to be logged by default (like Hugging Face).
(Could be a DeviceStatMo…
-
This test failed on a CI run on #6966:
https://github.com/oxidecomputer/omicron/pull/6966/checks?check_run_id=33061656597
Log showing the specific test failure:
https://buildomat.eng.oxide.comput…
-
**Is your feature request related to a problem? Please describe.**
I like graphs. We all like graphs. And alerts, too! I use Grafana to make me graphs and scream at me when something is wrong. I us…