AliyunContainerService / gpushare-scheduler-extender

GPU Sharing Scheduler for Kubernetes Cluster
Apache License 2.0
1.4k stars, 308 forks

How to get GPU metric data for pods deployed by this plugin #97

Closed: lazywhite closed this 3 years ago

lazywhite commented 4 years ago

https://github.com/NVIDIA/k8s-device-plugin
https://github.com/NVIDIA/gpu-monitoring-tools/tree/master/exporters/prometheus-dcgm/k8s/pod-gpu-metrics-exporter

With these tools I was able to get GPU metric data, but this repo conflicts with nvidia-device-plugin, so I can't get metric data any more.

Is there a monitoring tool that works with this repo?

zhaogaolong commented 4 years ago

I searched a lot and didn't find one, so I developed it myself. It is not free, though. If you want it, please leave your email.

lazywhite commented 4 years ago

> I searched a lot and didn't find one, so I developed it myself. It is not free, though. If you want it, please leave your email.

jsxcppking@gmail.com

ZhangSIming-blyq commented 4 years ago

> I searched a lot and didn't find one, so I developed it myself. It is not free, though. If you want it, please leave your email.

siming_zhang@shannonai.com, I want it too, thanks.

ZinuoCai commented 3 years ago

@zhaogaolong zinuocai@gmail.com

zhaogaolong commented 3 years ago

> @zhaogaolong zinuocai@gmail.com

My employer won't agree to this. I'm sorry, but I can't share the code after all. (dejected) @ZinuoCai @ZhangSIming-blyq @lazywhite