-
### Your current environment
I'm using 8 A100 GPUs with 40GB each to deploy LLaMA 3 70B. Under high concurrency, the average GPU utilization is only 50%. Why does the GPU utilization fluctuate so muc…
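To quantify the fluctuation rather than eyeball it, one option is to sample per-GPU utilization repeatedly. A minimal sketch, assuming `nvidia-smi` is on PATH (the query flags are standard `nvidia-smi` options; the function names are illustrative):

```python
import statistics
import subprocess

def parse_utilization(csv_text: str) -> list[int]:
    """Parse the output of
    `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits`:
    one integer percentage per visible GPU."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]

def sample_gpus() -> list[int]:
    # One utilization snapshot per GPU; requires nvidia-smi on PATH.
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"], text=True)
    return parse_utilization(out)

def fluctuation(samples: list[list[int]]) -> tuple[float, float]:
    """Mean and population stdev of the per-round average utilization.
    A large stdev confirms the load is bursty rather than steadily at 50%."""
    means = [statistics.mean(s) for s in samples]
    return statistics.mean(means), statistics.pstdev(means)
```

Calling `sample_gpus()` in a loop (e.g. once per second) and feeding the rounds to `fluctuation()` distinguishes a steady 50% from a 0%/100% sawtooth averaging to 50%.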
-
When I try to use this app, I constantly run into memory errors (see error log below). This occurs even if it is the only app I have open, and can happen after just a couple minutes of swiping left th…
-
`1.1g 219800 S 47.5 56.4 20:09.63 rlb-stats`
The process consumes an unusually large amount of RAM.
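Since the `top` header row isn't shown above, a small sketch of splitting that row into fields; the column meanings assumed here (VIRT, RES/SHR, state, %CPU, %MEM, TIME+, command) follow `top`'s default order and are an assumption:

```python
def parse_top_fields(line: str) -> dict:
    """Split a 7-field `top` process row into named fields.

    Column meanings are assumed from top's default layout, since the
    header row isn't included in the report.
    """
    virt, res, state, cpu, mem, time_plus, cmd = line.split()
    return {"virt": virt, "res": res, "state": state,
            "cpu_pct": float(cpu), "mem_pct": float(mem),
            "time": time_plus, "command": cmd}
```

Under that reading, the second percentage column (56.4) would be %MEM, which is what makes the line look memory-heavy.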
-
High memory utilization (>90% for 15m) on one of the Wazuh hosts
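The alert condition above (">90% for 15m") can be sketched as a pure function over per-minute samples; the function name and shape are illustrative, not Wazuh's actual rule engine:

```python
def sustained_high_memory(samples: list[float],
                          threshold: float = 90.0,
                          window: int = 15) -> bool:
    """Return True if every one of the last `window` samples exceeds
    `threshold` percent. `samples` holds one memory-utilization reading
    per minute, oldest first."""
    if len(samples) < window:
        return False
    return all(s > threshold for s in samples[-window:])
```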
-
### Your current environment
Deployed using LLaMA-Factory:
```shell
CUDA_VISIBLE_DEVICES=0 API_PORT=9092 python src/api_demo.py \
    --model_name_or_path /save_model/qwen1_5_7b_pcb_merge \
    --template qwen \
    --infer_b…
```
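For reference, a minimal client sketch against the launch command above, assuming the server exposes an OpenAI-style `/v1/chat/completions` endpoint on `API_PORT=9092` (the model name is an assumption taken from the merged checkpoint path):

```python
import json
import urllib.request

# Port taken from API_PORT=9092 in the launch command; host is assumed local.
API_URL = "http://localhost:9092/v1/chat/completions"

def build_chat_request(prompt: str,
                       model: str = "qwen1_5_7b_pcb_merge") -> dict:
    # Minimal OpenAI-style chat payload.
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.7}

def chat(prompt: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```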
-
### Description
Since version 24.1.3, DBeaver consumes an excessive amount of RAM.
Steps to reproduce: start DBeaver and create 2 connections.
No objects or editors are opened.
![image](https://github.com/user-attachmen…
-
### What happened?
one or more objects failed to apply, reason: "" is invalid: patch: Invalid value: "map[metadata:map[annotations:map[kubectl.kubernetes.io/last-applied-configuration:{\"apiVersion…
-
### Component(s)
receiver/hostmetrics
### What happened?
## Description
My otel collector is running in a container. I've followed the documentation to ensure that it is actually monitoring the ho…
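A containerized collector only sees the container's own `/proc` unless the host filesystem is mounted in (the hostmetrics receiver's `root_path` setting exists for exactly this). A small sketch of the distinction, reading memory utilization from a configurable root (the parsing is simplified: used = total - available):

```python
from pathlib import Path

def meminfo_utilization(root: str = "/") -> float:
    """Compute memory utilization percent from <root>/proc/meminfo.

    With root="/", a containerized process reports the container's view;
    pointing root at a host mount (e.g. /hostfs, as hostmetrics'
    root_path does) reports the host's.
    """
    fields = {}
    for line in (Path(root) / "proc" / "meminfo").read_text().splitlines():
        key, _, rest = line.partition(":")
        fields[key] = int(rest.split()[0])  # values are in kB
    total, avail = fields["MemTotal"], fields["MemAvailable"]
    return 100.0 * (total - avail) / total
```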
-
**Description**
A clear and concise description of what the bug is.
I'm running Triton Inference Server with vLLM backend as a container on Kubernetes.
I followed the [Triton metrics documentatio…
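To spot-check what the metrics endpoint actually reports, a tiny parser for the Prometheus text exposition format is often enough; this is a simplified sketch (it ignores label values containing spaces, and the sample metric name in the test is illustrative):

```python
def parse_prom_metrics(text: str) -> dict:
    """Minimal parser for Prometheus text exposition: returns
    {metric_name_with_labels: value}, skipping comments and blanks.
    Enough to spot-check a /metrics scrape by hand."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space: everything before is name+labels.
        name, _, value = line.rpartition(" ")
        out[name] = float(value)
    return out
```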
-
### Terraform Core Version
1.19
### AWS Provider Version
5.63.0, 5.63.1
### Affected Resource(s)
* aws_sagemaker_endpoint
### Expected Behavior
Modifying a SageMaker endpoint should time out if …
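The expected behavior above, failing with a timeout instead of hanging, amounts to a poll-with-deadline loop. A generic sketch (this is not the AWS provider's implementation; `clock`/`sleep` are injectable purely so the loop can be tested without waiting):

```python
import time

def wait_until(check, timeout_s: float, interval_s: float = 1.0,
               clock=time.monotonic, sleep=time.sleep) -> bool:
    """Poll `check()` until it returns True or `timeout_s` elapses.

    Mirrors what an update waiter should do for a resource stuck in an
    Updating state: give up after the deadline instead of blocking forever.
    Returns False on timeout so the caller can surface an error.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if check():
            return True
        sleep(interval_s)
    return False
```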