NVIDIA / gpu-monitoring-tools

Tools for monitoring NVIDIA GPUs on Linux
Apache License 2.0

Why require the k8s.io/kubernetes project directly? It is not recommended #123

Open utobe67 opened 4 years ago

utobe67 commented 4 years ago

Hi, this project's go.mod file requires k8s.io/kubernetes, which causes the error "k8s.io/api@v0.0.0: reading k8s.io/api/go.mod at revision v0.0.0: unknown revision v0.0.0". According to the issue linked below, depending on k8s.io/kubernetes directly is not recommended. https://github.com/kubernetes/kubernetes/issues/90358
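For readers hitting the same error: k8s.io/kubernetes requires its staging modules (k8s.io/api and friends) at v0.0.0 and resolves them through replace directives that only take effect when Kubernetes itself is the main module. A commonly used workaround is for the consuming module to add its own replace directives pinning each staging module to the v0.x.y tag that matches the Kubernetes v1.x.y release. The sketch below generates such lines. The version v1.18.0 and the three-module list are illustrative assumptions, not what this repository actually pins, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stagingModules is an illustrative subset of the modules published from the
// kubernetes/kubernetes staging directory (assumption: extend as needed).
var stagingModules = []string{
	"k8s.io/api",
	"k8s.io/apimachinery",
	"k8s.io/client-go",
}

// stagingVersion maps a Kubernetes release tag such as "v1.18.0" to the
// matching staging module tag "v0.18.0".
func stagingVersion(k8sVersion string) (string, error) {
	if !strings.HasPrefix(k8sVersion, "v1.") {
		return "", fmt.Errorf("unexpected Kubernetes version %q", k8sVersion)
	}
	return "v0." + strings.TrimPrefix(k8sVersion, "v1."), nil
}

// replaceDirectives returns one go.mod replace line per staging module.
func replaceDirectives(k8sVersion string, modules []string) ([]string, error) {
	v, err := stagingVersion(k8sVersion)
	if err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(modules))
	for _, m := range modules {
		lines = append(lines, fmt.Sprintf("replace %s => %s %s", m, m, v))
	}
	return lines, nil
}

func main() {
	// v1.18.0 is a hypothetical example version.
	lines, err := replaceDirectives("v1.18.0", stagingModules)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

The printed lines can be pasted into the go.mod of the module that imports this project. This only works around the symptom; the cleaner fix is for the library not to require k8s.io/kubernetes at all.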

rubu commented 3 years ago

Interested in this too. I wanted to use this module in my own Go module, but after adding an import for github.com/NVIDIA/gpu-monitoring-tools/bindings/go/nvml I get the same error about k8s.io/api@v0.0.0. I am new to Go and don't have a clue how to resolve this. What is the proper way of using this module inside another module? Or is this module simply in a broken state?

elezar commented 3 years ago

@rubu @utobe67 please check out the NVIDIA/go-nvml package instead.

rubu commented 3 years ago

@elezar thanks for the advice, but I need the bindings on Windows. In the worst case I could take the code from NVIDIA gpu-monitoring-tools or from go-nvml and just add the symbol loading from the DLL on Windows. Do you know why these repos differ in this area? gpu-monitoring-tools has Windows support and go-nvml does not, and as far as I can see the difference is just the API calls that load the library and get the exported symbols.

elezar commented 3 years ago

@rubu there is no explicit reason that we have not added Windows support to go-nvml. We currently rely heavily on community contributions for adding this as our focus is currently on Linux-based operating systems.

As you mention, the differences should be limited to loading the library and looking up symbols, so adding this to go-nvml should not be too problematic. Would you feel confident adding this to go-nvml, since this is the package we plan to support for interacting with NVML going forward? (Or, at the very least, testing the changes if we add this support.)
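To illustrate how small the platform-specific surface is, here is a minimal sketch that maps an operating system to the NVML shared library name and the loader calls involved. The type and function names are hypothetical and do not come from go-nvml or gpu-monitoring-tools; the library names are the ones NVIDIA drivers normally ship, but treat them as assumptions for your installation.

```go
package main

import (
	"fmt"
	"runtime"
)

// nvmlLoader describes what differs per platform when loading NVML.
// Hypothetical type, for illustration only.
type nvmlLoader struct {
	Library    string // shared library file name
	OpenCall   string // OS call that loads the library
	LookupCall string // OS call that resolves an exported symbol
}

// nvmlLoaderFor returns the loader description for a GOOS value.
func nvmlLoaderFor(goos string) (nvmlLoader, error) {
	switch goos {
	case "linux":
		return nvmlLoader{"libnvidia-ml.so.1", "dlopen", "dlsym"}, nil
	case "windows":
		return nvmlLoader{"nvml.dll", "LoadLibrary", "GetProcAddress"}, nil
	default:
		return nvmlLoader{}, fmt.Errorf("no known NVML loader for %q", goos)
	}
}

func main() {
	l, err := nvmlLoaderFor(runtime.GOOS)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("load %s with %s, resolve symbols with %s\n",
		l.Library, l.OpenCall, l.LookupCall)
}
```

In a real binding the two branches would live in separate files selected by build constraints (for example a `_linux.go` and a `_windows.go` file), with everything above the loader shared.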

rubu commented 3 years ago

@elezar I'll check whether go-nvml works for me, and if so I may try to add Windows support and open a merge request. By the way, after adding the import for go-nvml I get "module github.com/NVIDIA/go-nvml@latest found (v0.11.1-0), but does not contain package github.com/NVIDIA/go-nvml". Should I be adding it in a different way than just importing github.com/NVIDIA/go-nvml?

elezar commented 3 years ago

The import statement should read:

import "github.com/NVIDIA/go-nvml/pkg/nvml"

There is also a quick start in the README.

rubu commented 3 years ago

@elezar thanks, and sorry, I skipped reading the README :)

rubu commented 3 years ago

@elezar OK, now the build fails on dlfcn.h, so I'll try to add Windows support.

MaximShepelev commented 3 years ago

@rubu any luck with dlfcn.h? Struggling with Windows support as well =)

P.S. this might help

rubu commented 3 years ago

@Madmaxguy I found an easier way: I did not use this project at all and instead wrote C code that loads the NVML libraries directly. I have a feature branch with the Windows equivalent of dlfcn that I can push to my fork if this is still important for you.

MaximShepelev commented 3 years ago

@rubu I have already managed to compile another exporter for Windows, one that uses the nvidia-smi CLI tool. But a Windows version of the DCGM exporter would be nicer, as we use it for Linux hosts as well.
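For anyone curious what the nvidia-smi approach looks like, here is a minimal sketch of parsing its CSV output in Go. It assumes the tool was invoked as `nvidia-smi --query-gpu=name,utilization.gpu,memory.used --format=csv,noheader,nounits`; the struct, the function name, and the sample data are hypothetical and are not taken from any particular exporter.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

// gpuSample holds one row of the assumed query: name, utilization.gpu,
// memory.used (with nounits these are percent and MiB as plain numbers).
type gpuSample struct {
	Name           string
	UtilizationPct float64
	MemoryUsedMiB  float64
}

// parseNvidiaSMI parses CSV text produced by the command named above.
// Non-numeric values such as "[N/A]" are reported as errors.
func parseNvidiaSMI(output string) ([]gpuSample, error) {
	r := csv.NewReader(strings.NewReader(output))
	r.TrimLeadingSpace = true // nvidia-smi separates fields with ", "
	r.FieldsPerRecord = 3
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	samples := make([]gpuSample, 0, len(records))
	for _, rec := range records {
		util, err := strconv.ParseFloat(strings.TrimSpace(rec[1]), 64)
		if err != nil {
			return nil, fmt.Errorf("utilization %q: %w", rec[1], err)
		}
		mem, err := strconv.ParseFloat(strings.TrimSpace(rec[2]), 64)
		if err != nil {
			return nil, fmt.Errorf("memory %q: %w", rec[2], err)
		}
		samples = append(samples, gpuSample{strings.TrimSpace(rec[0]), util, mem})
	}
	return samples, nil
}

func main() {
	// Made-up sample output; a real exporter would run nvidia-smi via os/exec.
	sample := "Tesla T4, 35, 1024\nTesla T4, 0, 3\n"
	samples, err := parseNvidiaSMI(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, s := range samples {
		fmt.Printf("gpu %d: %s util=%v%% mem=%vMiB\n", i, s.Name, s.UtilizationPct, s.MemoryUsedMiB)
	}
}
```

Shelling out like this avoids any cgo or DLL loading on Windows, at the cost of spawning a process per scrape and depending on the CLI's output format.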

rubu commented 3 years ago

@Madmaxguy yep, I ended up using windows_exporter with nvml and wdm APIs.