neptune-ai / neptune-client

📘 The experiment tracker for foundation model training
https://neptune.ai
Apache License 2.0
580 stars, 63 forks

GPU power utilization added to monitoring namespace. #1854

Closed by harishankar-gopalan 2 months ago

harishankar-gopalan commented 3 months ago

I'm hoping the Neptune team will handle the CHANGELOG.md and docs updates. Fixes https://github.com/neptune-ai/neptune-client/issues/1853

SiddhantSadangi commented 3 months ago

Thanks for this PR @harishankar-gopalan 💟

We've handled the internals (changelog, tests, etc.) and made a few changes to the implementation (replaced nvmlDeviceGetPowerManagementLimit with nvmlDeviceGetEnforcedPowerLimitConstraints for broader support and relevance).

We'll merge and release sometime next week.

Meanwhile, would it be possible for you to check if all changes work as expected at your end?
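
For readers who want to sanity-check the new metric locally, here is a minimal sketch of how GPU power utilization could be derived from NVML readings. It is not the code merged into neptune-client. The helper names `power_utilization_percent` and `read_gpu_power_percent` are hypothetical, and the sketch assumes the `pynvml` bindings are installed. It uses `nvmlDeviceGetEnforcedPowerLimit`, a call that exists in pynvml; whether this is the exact function the comment above refers to is an assumption.

```python
from typing import Optional


def power_utilization_percent(usage_mw: float, limit_mw: float) -> Optional[float]:
    """Return power draw as a percentage of the limit, or None if the limit is unusable."""
    if limit_mw <= 0:
        return None
    return 100.0 * usage_mw / limit_mw


def read_gpu_power_percent(gpu_index: int = 0) -> Optional[float]:
    """Hypothetical helper: read power draw and enforced limit for one GPU via NVML.

    Returns None when pynvml is missing, NVML cannot be initialized,
    or the device does not support power readings.
    """
    try:
        import pynvml  # assumption: the NVML Python bindings are installed
    except ImportError:
        return None

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None

    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        usage_mw = pynvml.nvmlDeviceGetPowerUsage(handle)  # milliwatts
        limit_mw = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle)  # milliwatts
        return power_utilization_percent(usage_mw, limit_mw)
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()
```

Calling `read_gpu_power_percent()` on a machine with a supported NVIDIA GPU should return a float; on machines without NVML it returns `None` rather than raising, so a monitoring loop can skip the metric.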

harishankar-gopalan commented 3 months ago

Sure @SiddhantSadangi, I'll do that in the coming week and get back to you.

SiddhantSadangi commented 2 months ago

@harishankar-gopalan - this has been included in the 1.11.0 release 🎉

harishankar-gopalan commented 2 months ago

> @harishankar-gopalan - this has been included in the 1.11.0 release 🎉

That's awesome @SiddhantSadangi