-
During the investigation of the storage utilization metrics, I noticed that the information is reported differently depending on how I enter the OS metrics dashboard. Take for instance the hydra command: `hyd…
-
> Node count, disks per node, network per node, latency metrics from both cluster and disk
> Some notion of performance utilization, i.e. is it heavily loaded or typically idle
-
By default, the instructlab/training loop saves the full parameter / optimizer / lr_scheduler state every epoch. This default is configured in the TrainingArguments object, and isn't exposed to the CL…
-
We should track host-level metrics such as memory usage, disk usage, CPU usage, etc., and combine that with the existing block processing time data `replayor` is already collecting to give a more holist…
-
Under Utilization, you might want to rename 'Disk Load' to 'Disk Usage'.
'Disk Load' is an industry-standard term for IOPS and their stats in a similar context.
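
To illustrate the distinction, here is a minimal sketch of the two quantities, not the dashboard's actual implementation. The names `IoSample`, `disk_usage_percent`, and `iops` are hypothetical and exist only for this example. It assumes cumulative read/write completion counters are available from somewhere (on Linux, `/proc/diskstats` exposes counters of this kind).

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class IoSample:
    """Cumulative I/O counters at one point in time (hypothetical structure)."""
    timestamp_s: float
    reads_completed: int
    writes_completed: int


def disk_usage_percent(path: str = "/") -> float:
    """'Disk Usage': share of the filesystem's capacity that is occupied."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def iops(earlier: IoSample, later: IoSample) -> float:
    """'Disk Load': I/O operations per second between two counter samples."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    ops = (later.reads_completed - earlier.reads_completed) + (
        later.writes_completed - earlier.writes_completed
    )
    return ops / elapsed
```

A disk can be nearly full while almost idle, or nearly empty while saturated with I/O, which is why the two labels should not be used interchangeably.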
-
## Describe the bug
There is an app with rather high IO (500-1000 IOPS). If there is a Longhorn background task (like replica rebuild or snapshot deletion), then the instance-manager pod running at…
-
"Disk Utilization" fails to display data:
"A Status: 500. Message: InfluxDB returned error: error parsing query: found DTBFAT0, expected ) at line 1, char 123"
-
`CUDA_VISIBLE_DEVICES=0,1 lm_eval --model vllm \
--model_args pretrained=/home/jovyan/data-vol-1/models/meta-llama__Llama3.1-70B-Instruct,tensor_parallel_size=2,dtype=auto,gpu_memory_utilization=…
-
### Versions
Core
Version is v5.18.3-550-g112b961 (Latest: null)
Branch is development
Hash is 112b9617 (Latest: 112b9617)
Web
Version is v5.21-1012-gc689fdc7 (Latest: null)
Branch…
-
The last 2 columns from running the command `docker container ls -s`
I've italicized the containers that are not created by this repository.
```
NAMES SIZE
_postgres …