tensorflow / agents

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Apache License 2.0
2.79k stars 722 forks source link

Python.NET & TensorFlow & CUDA: Could not load dynamic library 'cublas64_11.dll' #593

Closed alexhiggins732 closed 3 years ago

alexhiggins732 commented 3 years ago

Environment

I am current working on using Python.NET to build C# environments for interaction TensorFlow Agents and am receiving a TensorFlow error attempting to load Cuda DLLs.

When I run pure python examples Tensor flow loads the CUDA DLLs without issue:

2021-04-19 03:22:41.062449: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-19 03:22:41.062943: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-19 03:22:41.063347: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-04-19 03:22:41.063709: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-04-19 03:22:41.064088: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-04-19 03:22:41.064455: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-04-19 03:22:41.064832: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-04-19 03:22:41.065202: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll

However, when I run an environment that uses a Python environment that is essentially a wrapper for an environment written in C# using Python.Net is recieve errors the Cuda DLLs were not found:

2021-04-19 03:15:14.884746: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublas64_11.dll'; dlerror: cublas64_11.dll not found
2021-04-19 03:15:14.885031: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cublasLt64_11.dll'; dlerror: cublasLt64_11.dll not found
2021-04-19 03:15:14.885281: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-04-19 03:15:14.885586: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-04-19 03:15:14.885851: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2021-04-19 03:15:14.886174: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cusparse64_11.dll'; dlerror: cusparse64_11.dll not found
2021-04-19 03:15:14.886454: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'cudnn64_8.dll'; dlerror: cudnn64_8.dll not found

Minimal code to reproduce the issue:

import tensorflow as tf
from TicTacToeSharpEnvironmentWrapper import TicTacToeEnvironment
env = TicTacToeEnvironment()
physical_devices = tf.config.list_physical_devices('GPU')

With TicTacToeSharpEnvironmentWrapper.py

import tensorflow as tf
from tf_agents.environments import py_environment
from tf_agents.specs import BoundedArraySpec
from tf_agents.trajectories.time_step import StepType
from tf_agents.trajectories.time_step import TimeStep
import numpy as np

assembly_path1 = r"C:\DesktopGym\bin\Debug"
import sys

sys.path.append(assembly_path1)
import clr
clr.AddReference("GymSharp")
from GymSharp import TicTacToeSharpEnvironment
"""A CSharp environment for Tic-Tac-Toe game."""
class TicTacToeEnvironment(py_environment.PyEnvironment):
  """A state-settable environment for Tic-Tac-Toe game.
  """

def __init__(self):
    super(TicTacToeEnvironment, self).__init__()
    self.sharp_env = TicTacToeSharpEnvironment()

TicTacToeSharpEnvironment is a c# class libary compiled as 64bit dll

public class TicTacToeSharpEnvironment
{
    static TicTacToeSharpEnvironment()
    {
        PythonInitiliazer.InitializePython();
    }
}

And PythonInitiliazer is used to initalize Python.Net

public class PythonInitiliazer
{
    static PythonInitiliazer()
    {
        InitializePython();
    }
    static bool initialized;
    public static void InitializePython()
    {
        if (!initialized)
        {
            initPython();
            initialized = true;
        }
    }
    private static void initPython()
    {

        string pathToVirtualEnv = @"C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\";

        Environment.SetEnvironmentVariable("PATH", pathToVirtualEnv, EnvironmentVariableTarget.Process);
        Environment.SetEnvironmentVariable("PYTHONHOME", pathToVirtualEnv, EnvironmentVariableTarget.Process);
        Environment.SetEnvironmentVariable("PYTHONPATH", $"{pathToVirtualEnv}\\Lib\\site-packages;{pathToVirtualEnv}\\Lib;{pathToVirtualEnv}\\scripts", EnvironmentVariableTarget.Process);
        Runtime.PythonDLL = "C:\\Program Files (x86)\\Microsoft Visual Studio\\Shared\\Python37_64\\python37.dll";

        PythonEngine.PythonHome = pathToVirtualEnv;
        PythonEngine.PythonPath = Environment.GetEnvironmentVariable("PYTHONPATH", EnvironmentVariableTarget.Process);
        PythonEngine.Initialize();
    }
}

The full code works. The Python Wrapper of the C# environment passes the Tensorflow Agents unit tests for the Tic Tac Toe environment. The C# environment can be wrapped as Python or a Tensor flow environment and the various agents can train against the environment.

I don't think this is a compatibility issue using the x64 .Net DLL because I am using 64 bit python, but I a can't be certain.

alexhiggins732 commented 3 years ago

I solved this issue. It was due to bad Python.Net wiki documentation showing how to use Python.Net in a virtual environment.

The fix, for others facing this or very similar issues, is to not use the code in the Wiki:

var pathToVirtualEnv = @"path\to\env";

Environment.SetEnvironmentVariable("PATH", pathToVirtualEnv, EnvironmentVariableTarget.Process);

Doing so will overwrite your PATH environmental variable.

Instead, append the path to your python virtual environment to your existing PATH environmental variable.

Instead, append the path to your virtual environment to your PATH environmental variable.

string pathToVirtualEnv = @"path\to\env";

var path = Environment.GetEnvironmentVariable("PATH");
Environment.SetEnvironmentVariable("PATH", path + ";" + pathToVirtualEnv, EnvironmentVariableTarget.Process);