ganglia / gmond_python_modules

Repository of user-contributed Gmond Python DSO metric modules
http://sourceforge.net/apps/trac/ganglia/wiki/ganglia_gmond_python_modules

Ganglia GPU Monitoring Enhancements #155

Closed ranacseruet closed 10 years ago

ranacseruet commented 10 years ago

Metrics Added:

Metrics Modified:

Metrics Deleted:

Custom Graph Modifications:

dpocock commented 10 years ago

Rana, GitHub says there is a merge conflict. Can you provide an additional commit on your branch to resolve it? Maybe somebody else has changed one of the files while you were waiting for this to be accepted.

dpocock commented 10 years ago

Link to email discussion: http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg06725.html

Link to report: http://www.mail-archive.com/ganglia-developers@lists.sourceforge.net/msg06725/GPU_Monitoring_Enhancement_Report-_Draft.pdf

ranacseruet commented 10 years ago

Thanks, Daniel. I have merged with the master branch, and it seems it can now be merged automatically.

eshelman commented 10 years ago

@ranacseruet This is good - I didn't realize we were missing this many metrics!

Would you please explain why you removed the power_man_mode and perf_state metrics? Those metrics seem to be working fine on our systems.

ranacseruet commented 10 years ago

@eshelman, thanks for reviewing! Actually, most of the new metrics are taken from the new NVML version.

To answer your question, I removed those metrics based on suggestions from and discussion with my mentor, Rajat Phull (rphull@nvidia.com). The reasons are given below:

power_man_mode: On the current generation and some recent generations of GPUs, the power management module is always available on the device, so we decided to turn this metric off. If the community needs it, we can enable it again.

perf_state: We thought that looking at the current clocks gives a better representation of the performance state, and that reporting the pstate in the constant/host section may not add much value. But again, if the community needs it, we will bring it back.

eshelman commented 10 years ago

Thanks for the details! Your explanation makes sense - these changes wouldn't be a problem for us.

ranacseruet commented 10 years ago

That's great then. Thanks!

dpocock commented 10 years ago

Thanks for this contribution, Rana.