We are planning to support CPU acceleration with GGUF + Hugging Face Transformers, so CPU and memory usage should be covered by the monitoring system. Because we use language models plus adapters, disk usage should be monitored as well.
As we discussed, the API aggregator container does not need a separately deployed monitoring service. It can directly use the monitoring system inside the Kubernetes cluster.
Contact Details (optional)
No response
What feature are you requesting?
We only need to consider:
We are planning to support CPU acceleration with GGUF + Hugging Face Transformers, so CPU and memory usage should be covered by the monitoring system. Because we use language models plus adapters, disk usage should be monitored as well.
https://github.com/orgs/SkywardAI/discussions/10
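To make the three metrics concrete, here is a minimal sketch of what a per-process snapshot of CPU, memory, and disk usage could look like. It is only an illustration, not the proposed implementation: the function name `collect_usage` and the `model_dir` parameter are hypothetical, it uses only the Python standard library, and the `resource` module is Unix-only. In the cluster, these values would come from the Kubernetes monitoring system rather than from code like this.

```python
import resource  # Unix-only standard library module
import shutil


def collect_usage(model_dir="."):
    """Return a snapshot of CPU, memory, and disk usage.

    `model_dir` is a hypothetical path where models and adapters are stored;
    disk usage is reported for the filesystem that contains it.
    """
    disk = shutil.disk_usage(model_dir)
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        # User + system CPU time consumed by this process, in seconds.
        "cpu_seconds": usage.ru_utime + usage.ru_stime,
        # Peak resident set size (KiB on Linux, bytes on macOS).
        "peak_rss": usage.ru_maxrss,
        "disk_total_bytes": disk.total,
        "disk_used_bytes": disk.used,
        "disk_free_bytes": disk.free,
    }


if __name__ == "__main__":
    print(collect_usage())
```

The disk figures matter here because model and adapter files can be large, so running out of space on the model volume is a realistic failure mode.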
Sub tasks