mindprince / gonvml

NVIDIA Management Library (NVML) bindings for Go
Apache License 2.0

Cannot dynamically link nvidia-ml and use gonvml at the same time #8

Open wangkechun opened 5 years ago

wangkechun commented 5 years ago

Example that fails to query the device count:

package main

// #cgo LDFLAGS: -ldl -lnvidia-ml
// #cgo CFLAGS: -I /usr/local/cuda-8.0/include
/*
#include <stdio.h>
#include "nvml.h"
void testNvml()
{
    nvmlReturn_t result;
    unsigned int device_count;

    // First initialize NVML library
    result = nvmlInit();
    if (NVML_SUCCESS != result)
    {
        printf("Failed to initialize NVML: %s\n", nvmlErrorString(result));
        return;
    }

    result = nvmlDeviceGetCount(&device_count);
    if (NVML_SUCCESS != result)
    {
        printf("Failed to query device count: %s\n", nvmlErrorString(result));
        return;
    }
    printf("Found %u device%s\n\n", device_count, device_count != 1 ? "s" : "");
}
*/
import "C"

import (
    "fmt"

    "github.com/mindprince/gonvml"
)

func testGoNvml() (value []string) {
    err := gonvml.Initialize()
    if err != nil {
        return
    }
    defer func() { _ = gonvml.Shutdown() }()

    n, err := gonvml.DeviceCount()
    if err != nil {
        panic(err)
    }
    fmt.Println("testGoNvml gpu num:", n)
    return
}

func main() {
    C.testNvml()
    testGoNvml()
    C.testNvml()
}

Output:

Failed to query device count: nvmlErrorString Function Not Found
testGoNvml gpu num: 8
Found 8 devices

Example that works (the same C code, without importing gonvml):

package main

// #cgo LDFLAGS: -ldl -lnvidia-ml
// #cgo CFLAGS: -I /usr/local/cuda-8.0/include
/*
#include <stdio.h>
#include "nvml.h"
void testNvml()
{
    nvmlReturn_t result;
    unsigned int device_count;

    // First initialize NVML library
    result = nvmlInit();
    if (NVML_SUCCESS != result)
    {
        printf("Failed to initialize NVML: %s\n", nvmlErrorString(result));
        return;
    }

    result = nvmlDeviceGetCount(&device_count);
    if (NVML_SUCCESS != result)
    {
        printf("Failed to query device count: %s\n", nvmlErrorString(result));
        return;
    }
    printf("Found %u device%s\n\n", device_count, device_count != 1 ? "s" : "");
}
*/
import "C"

func main() {
    C.testNvml()
}

Output:

Found 8 devices

This is a simplified example. A more realistic situation looks like this:

A complex Go program imports packages a and b.
Package a imports gonvml.
Package b calls libnvidia-xxx through cgo.
libnvidia-xxx is dynamically linked against libnvidia-ml.

wangkechun commented 5 years ago

Possible solutions:

  1. Add the following line to gonvml's cgo directives:

// #cgo CFLAGS: -fvisibility=hidden

  2. Rename all the duplicate symbols in gonvml, such as the one defined here:

https://github.com/mindprince/gonvml/blob/b364b296c7320f5d3dc084aa536a3dba33b68f90/bindings.go#L51
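
To illustrate the second suggestion (renaming the duplicate symbols), here is a minimal sketch in C. It assumes gonvml wraps each NVML call in a function of the same name that forwards to a pointer obtained with dlsym, which is what the "nvmlErrorString Function Not Found" message in the failing output suggests. The `gonvml_` prefix, the `gonvml_load` helper, and the `nvmlReturn_t` stand-in typedef are hypothetical names chosen for this example, not part of gonvml or NVML.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Stand-in for the nvmlReturn_t enum from nvml.h (assumption), so the
   sketch compiles without the NVML headers. */
typedef int nvmlReturn_t;

/* State filled in by gonvml_load; static so neither symbol is exported. */
static void *nvmlHandle = NULL;
static const char *(*errorStringFunc)(nvmlReturn_t) = NULL;

/* Hypothetical prefixed wrapper. Because it is not named nvmlErrorString,
   code that links libnvidia-ml directly keeps resolving nvmlErrorString
   to the real library function instead of to this wrapper. */
const char *gonvml_nvmlErrorString(nvmlReturn_t result) {
    if (errorStringFunc == NULL) {
        return "nvmlErrorString Function Not Found";
    }
    return errorStringFunc(result);
}

/* Hypothetical loader: opens the library at path and looks up the real
   nvmlErrorString. Returns 0 on success, -1 if the library or the symbol
   cannot be found. */
int gonvml_load(const char *path) {
    nvmlHandle = dlopen(path, RTLD_LAZY);
    if (nvmlHandle == NULL) {
        return -1;
    }
    errorStringFunc =
        (const char *(*)(nvmlReturn_t))dlsym(nvmlHandle, "nvmlErrorString");
    return errorStringFunc == NULL ? -1 : 0;
}
```

With this naming, the Go side would call `C.gonvml_nvmlErrorString` rather than `C.nvmlErrorString`, and the same prefix would have to be applied to every other wrapper whose name currently matches an NVML function. On older glibc versions the sketch needs `-ldl` at link time.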