NVIDIA / go-nvml

Go Bindings for the NVIDIA Management Library (NVML)
Apache License 2.0

could not determine kind of name for C.nvmlComputeInstancePlacement_t #116 #34

Closed jshen28 closed 2 years ago

jshen28 commented 2 years ago

Hello, I am trying to run the examples using make examples and found that the generated "types_gen.go" file is empty.

# make examples
c-for-go -out /root/go/src/github.com/NVIDIA/go-nvml/pkg /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/nvml.yml
  processing /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/nvml.yml done.
cp /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/*.go /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml
cd /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml; \
    go tool cgo -godefs types.go > types_gen.go; \
    go fmt types_gen.go; \
cd -> /dev/null
/root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/types.go:174:31: could not determine kind of name for C.nvmlComputeInstancePlacement_t
/root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/types.go:55:33: could not determine kind of name for C.nvmlRowRemapperHistogramValues_t
types_gen.go:1:1: expected 'package', found 'EOF'
rm -rf /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/types.go /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/_obj
grep -l -R "// WARNING: This file has automatically been generated on" pkg \
    | xargs sed -i -E 's#// WARNING: This file has automatically been generated on.*$#// WARNING: THIS FILE WAS AUTOMATICALLY GENERATED.#g'
grep -l -RE "// (.*) nvml/nvml.h:[0-9]+" pkg \
    | xargs sed -i -E 's#// (.*) nvml/nvml.h:[0-9]+$#// \1 nvml/nvml.h#g'
GOOS=linux go build ./examples/compute-processes
pkg/nvml/types_gen.go:1:1: expected 'package', found 'EOF'
make: *** [Makefile:51: example-compute-processes] Error 1
# go tool cgo -godefs types.go
/home/sjt/go-nvml-master/pkg/nvml/types.go:174:31: could not determine kind of name for C.nvmlComputeInstancePlacement_t
/home/sjt/go-nvml-master/pkg/nvml/types.go:55:33: could not determine kind of name for C.nvmlRowRemapperHistogramValues_t

The Go version is:

# go version go1.17.6 linux/amd64

CUDA is not installed on the server:

# nvidia-smi
Tue Jan 25 15:56:14 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63       Driver Version: 470.63       CUDA Version: N/A      |
# go env
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3429732563=/tmp/go-build -gno-record-gcc-switches"

So what did I do wrong? Thank you.

elezar commented 2 years ago

Hi @jshen28, on which platform are you compiling go-nvml?

klueska commented 2 years ago

What exactly do you mean by "compile and run go-nvml from source"? Can you provide the exact command you are running?

jshen28 commented 2 years ago

Hello, I've updated the description and hopefully made myself a bit clearer this time. @elezar @klueska

jshen28 commented 2 years ago

I got a different result on another machine:

# make test-bindings
c-for-go -out /root/go/src/github.com/NVIDIA/go-nvml/pkg /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/nvml.yml
  processing /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/nvml.yml done.
cp /root/go/src/github.com/NVIDIA/go-nvml/gen/nvml/*.go /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml
cd /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml; \
    go tool cgo -godefs types.go > types_gen.go; \
    go fmt types_gen.go; \
cd -> /dev/null
types_gen.go
rm -rf /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/types.go /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml/_obj
grep -l -R "// WARNING: This file has automatically been generated on" pkg \
    | xargs sed -i -E 's#// WARNING: This file has automatically been generated on.*$#// WARNING: THIS FILE WAS AUTOMATICALLY GENERATED.#g'
grep -l -RE "// (.*) nvml/nvml.h:[0-9]+" pkg \
    | xargs sed -i -E 's#// (.*) nvml/nvml.h:[0-9]+$#// \1 nvml/nvml.h#g'
cd /root/go/src/github.com/NVIDIA/go-nvml/pkg/nvml; \
    go test -v .; \
cd -> /dev/null
# github.com/NVIDIA/go-nvml/pkg/nvml [github.com/NVIDIA/go-nvml/pkg/nvml.test]
./device.go:107:54: cannot use &CpuSet[0] (type *uint) as type *uint32 in argument to nvmlDeviceGetCpuAffinity
./device.go:137:58: cannot use &NodeSet[0] (type *uint) as type *uint32 in argument to nvmlDeviceGetMemoryAffinity
./device.go:149:65: cannot use &CpuSet[0] (type *uint) as type *uint32 in argument to nvmlDeviceGetCpuAffinityWithinScope
FAIL    github.com/NVIDIA/go-nvml/pkg/nvml [build failed]
FAIL

I slightly modified the Makefile to use a local jq executable instead of Docker.

jshen28 commented 2 years ago

Hmm, it looks like there was an "nvml.h" collision. After removing an old "nvml.h" file, the "could not determine kind of name for C.nvmlComputeInstancePlacement_t" error is gone.

klueska commented 2 years ago

Make sure you pin the version of c-for-go that you use to: 8eeee8c3b71f9c3c90c4a73db54ed08b0bba971d, i.e.:

go install github.com/xlab/c-for-go@8eeee8c3

jshen28 commented 2 years ago

Thanks. By the way, where does cgo find the headers? I had put some older "nvml.h" files in /tmp, and it looks like cgo mistakenly used them.

klueska commented 2 years ago

The temporary types.go file has #include "nvml.h" in it. This tells the compiler to use the nvml.h file found on the local path (as opposed to a system-level one). If cgo does its work under /tmp, it's possible that a different nvml.h already present under /tmp might conflict. I'm not positive of this, but it seems plausible.