pythonnet / pythonnet

Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
http://pythonnet.github.io
MIT License

virtual environment: either crashes or cannot find modules #1478

Open Felk opened 3 years ago

Felk commented 3 years ago

Environment

Details

According to https://github.com/pythonnet/pythonnet/issues/984, .NET Core is supported, which I hope includes .NET 5.0. According to https://github.com/pythonnet/pythonnet/issues/1389, Python 3.9 is supported in pythonnet 2.5, though I was unable to find 2.5 on NuGet, so I went with a 3.0 preview in the hope that it will work too.

Describe what you were trying to get done.

Embed python in a C# application with a virtual environment (venv) and import a module from it.

I have set up a virtual environment at C:\Users\felk\venv39 and verified that it works and I can use modules that are available in the venv but not globally:

PS C:\Users\felk> python -c 'import chat_downloader; print(True)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'chat_downloader'
PS C:\Users\felk> .\venv39\Scripts\Activate.ps1
(venv39) PS C:\Users\felk> python -c 'import chat_downloader; print(True)'
True

According to the wiki this is the code to get virtual environments working:

var pathToVirtualEnv = @"path\to\env";

// be sure not to overwrite your existing "PATH" environmental variable.
var path = Environment.GetEnvironmentVariable("PATH").TrimEnd(';');
path = string.IsNullOrEmpty(path) ? pathToVirtualEnv : path + ";" + pathToVirtualEnv;
Environment.SetEnvironmentVariable("PATH", path, EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("PATH", pathToVirtualEnv, EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("PYTHONHOME", pathToVirtualEnv, EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("PYTHONPATH", $"{pathToVirtualEnv}\\Lib\\site-packages;{pathToVirtualEnv}\\Lib", EnvironmentVariableTarget.Process);

PythonEngine.PythonHome = pathToVirtualEnv;
PythonEngine.PythonPath = Environment.GetEnvironmentVariable("PYTHONPATH", EnvironmentVariableTarget.Process);

However, it is not clear to me what comes after that, most importantly where to call PythonEngine.Initialize();, so I attempted to piece something together:

However, when I try to import a module present in the virtual environment, it is not being found. I can observe that the venv path is part of PythonEngine.PythonPath but not part of sys.path by checking after setting PythonEngine.PythonPath:

dynamic sys = Py.Import("sys");
Console.WriteLine(sys.path);
Console.WriteLine(PythonEngine.PythonPath);

this results in

['C:\\Program Files\\Python39\\python39.zip', 'C:\\Program Files\\Python39\\Lib', 'C:\\Program Files\\Python39\\DLLs', 'S:\\projects\\myproject\\bin\\Debug\\net5.0', 'C:\\Users\\felk\\AppData\\Roaming\\Python\\Python39\\site-packages', 'C:\\Program Files\\Python39', 'C:\\Program Files\\Python39\\lib\\site-packages', 'C:\\Program Files\\dotnet\\shared\\Microsoft.NETCore.App\\5.0.7\\']
C:/Users/felk/venv39/Lib/site-packages;C:/Users/felk/venv39/Lib;

This seems to be unrelated, but I also removed this line from the wiki's example as it seems nonsensical to override the PATH with the venv path if it was just set to the correctly appended one a line above, and since the comment above literally just said be sure not to overwrite your existing "PATH" environmental variable:

Environment.SetEnvironmentVariable("PATH", pathToVirtualEnv, EnvironmentVariableTarget.Process);

I also tried appending to PythonEngine.PythonPath instead of replacing it, as described in https://github.com/pythonnet/pythonnet/issues/1348 but that had no effect either.

Felk commented 3 years ago

Some more random tinkering led me to a solution. The following order of operations works, deviating from the wiki:

  1. Set PythonEngine.PythonPath
  2. Set PythonEngine.PythonHome
  3. PythonEngine.Initialize();

Doing things in this order does not crash and also causes the new python path to be visible in sys.path. My complete example looks something like this:

Runtime.PythonDLL = "C:/Program Files/Python39/python39.dll";
string pathToVirtualEnv = "C:/Users/felk/venv39";

string path = Environment.GetEnvironmentVariable("PATH")!.TrimEnd(Path.PathSeparator);
path = string.IsNullOrEmpty(path) ? pathToVirtualEnv : path + Path.PathSeparator + pathToVirtualEnv;
Environment.SetEnvironmentVariable("PATH", path, EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("PYTHONHOME", pathToVirtualEnv, EnvironmentVariableTarget.Process);
Environment.SetEnvironmentVariable("PYTHONPATH",
    $"{pathToVirtualEnv}/Lib/site-packages{Path.PathSeparator}" + 
    $"{pathToVirtualEnv}/Lib{Path.PathSeparator}", EnvironmentVariableTarget.Process);

PythonEngine.PythonPath = PythonEngine.PythonPath + Path.PathSeparator +
                          Environment.GetEnvironmentVariable("PYTHONPATH", EnvironmentVariableTarget.Process);
PythonEngine.PythonHome = pathToVirtualEnv;

PythonEngine.Initialize();

dynamic sys = Py.Import("sys");
Console.WriteLine(sys.path);
Console.WriteLine(PythonEngine.PythonPath);

PythonEngine.BeginAllowThreads();
// ... any further code wrapped in using(Py.GIL()) { ... }

console output:

['C:\\Program Files\\Python39\\python39.zip', 'C:\\Program Files\\Python39\\Lib', 'C:\\Program Files\\Python39\\DLLs', 'S:\\projects\\myproject\\bin\\Debug\\net5.0', 'C:\\Users\\felk\\venv39\\Lib\\site-packages', 'C:\\Users\\felk\\venv39\\Lib', 'C:\\Users\\felk\\AppData\\Roaming\\Python\\Python39\\site-packages', 'C:\\Program Files\\dotnet\\shared\\Microsoft.NETCore.App\\5.0.7\\']
C:\Program Files\Python39\python39.zip;C:\Program Files\Python39\Lib\;C:\Program Files\Python39\DLLs\;S:\projects\myproject\bin\Debug\net5.0;C:/Users/felk/venv39/Lib/site-packages;C:/Users/felk/venv39/Lib;

This seems to work for me, but I'll leave this issue open for now in case anyone wants to adjust the wiki.

bpdavis86 commented 3 years ago

I am appending my own troubles with virtual environments to this open issue rather than create a new one.

Environment

Issue

I spent the better part of two days trying to get virtual environments working with pythonnet. Part of the problem is that there are several different methods of creating virtual environments, and they may end up creating slightly different "flavors" of environment. For example, see:

https://stackoverflow.com/questions/41573587/what-is-the-difference-between-venv-pyvenv-pyenv-virtualenv-virtualenvwrappe

This does not even cover conda environments which are totally separate.

In order to better understand how virtual environments can be made to work with pythonnet, you have to dig into the details of how Python sets up sys.path and how this behaves with virtual environments, in particular, PEP 405 "lightweight" environments such as those created with the venv module.

Key Reference Material:

If one reads through these documents, one finds that all things begin with sys.executable. Python walks up the path looking for its system libraries based on this location except in the case where it finds the file "pyvenv.cfg" one level above the executable. This is the extension provided by PEP 405 to enable lightweight virtual environments. In this case, the pyvenv.cfg "home" key tells Python where the base install is.
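
As a rough illustration of that lookup, the sketch below mimics the PEP 405 check in plain Python: given a candidate executable path, it looks for pyvenv.cfg next to the executable or one directory above it and returns the value of the "home" key. The function name find_venv_home is made up for this example; CPython does this in its own startup code, not through any such helper.

```python
from pathlib import Path
from typing import Optional


def find_venv_home(executable: str) -> Optional[str]:
    """Return the 'home' value from a pyvenv.cfg near `executable`, or None."""
    # Symlinks are deliberately not resolved: on POSIX the venv's python is
    # often a symlink to the base interpreter, and resolving it would skip the venv.
    exe_dir = Path(executable).parent
    for cfg in (exe_dir / "pyvenv.cfg", exe_dir.parent / "pyvenv.cfg"):
        if not cfg.is_file():
            continue
        for line in cfg.read_text(encoding="utf-8").splitlines():
            key, sep, value = line.partition("=")
            if sep and key.strip().lower() == "home":
                return value.strip()
    return None
```

Called with the path of a venv's Scripts\python.exe, this should return the base install folder. Called with the path of an embedding .NET executable, it returns None unless a pyvenv.cfg happens to sit next to or just above that executable.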

This already presents a problem, because let's see how sys.executable gets set from a C# program using pythonnet to embed Python. Here is a test program to expose some details of the resulting Python environment.

using System;
using System.Collections.Generic;
using Python.Runtime;
namespace TestPythonnet
{
    class Program
    {
        static void Main(string[] args)
        {
            // using my base python install, not a venv            
            var pathToBaseEnv = @"C:\Users\myuser\AppData\Local\Programs\Python\Python38";
            Environment.SetEnvironmentVariable("PYTHONHOME", pathToBaseEnv, EnvironmentVariableTarget.Process);
            Runtime.PythonDLL = pathToBaseEnv + @"\python38.dll";

            PythonEngine.Initialize();
            using (Py.GIL())
            {
                dynamic sys = Py.Import("sys");
                dynamic os = Py.Import("os");
                Console.WriteLine($"PYTHONHOME: {os.getenv("PYTHONHOME")}");
                Console.WriteLine($"PYTHONPATH: {os.getenv("PYTHONPATH")}");
                Console.WriteLine($"sys.executable: {sys.executable}");
                Console.WriteLine($"sys.prefix: {sys.prefix}");
                Console.WriteLine($"sys.base_prefix: {sys.base_prefix}");
                Console.WriteLine($"sys.exec_prefix: {sys.exec_prefix}");
                Console.WriteLine($"sys.base_exec_prefix: {sys.base_exec_prefix}");
                Console.WriteLine("sys.path:");
                foreach (var p in sys.path)
                {
                    Console.WriteLine(p);
                }
                Console.WriteLine();
            }

            PythonEngine.Shutdown();

        }
    }
}

Output:

PYTHONHOME:
PYTHONPATH: C:\Users\myuser\OneDrive\Documents\Python Scripts
sys.executable: C:\Users\myuser\source\repos\TestPythonnet\TestPythonnet\bin\Debug\netcoreapp3.1\TestPythonnet.exe
sys.prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.base_prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.exec_prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.base_exec_prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.path:
C:\Users\myuser\OneDrive\Documents\Python Scripts
C:\Users\myuser\AppData\Local\Programs\Python\Python38\python38.zip
C:\Users\myuser\AppData\Local\Programs\Python\Python38\Lib
C:\Users\myuser\AppData\Local\Programs\Python\Python38\DLLs
C:\Users\myuser\source\repos\TestPythonnet\TestPythonnet\bin\Debug\netcoreapp3.1
C:\Users\myuser\AppData\Local\Programs\Python\Python38
C:\Users\myuser\AppData\Local\Programs\Python\Python38\lib\site-packages
C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.1.18\

There are two major points to notice here:

  1. The setting of the PYTHONHOME environment variable did nothing for the environment variables inside Python.
  2. The executable detected is the embedding .NET program rather than the Python interpreter.

The first issue may be related to this post regarding SetEnvironmentVariable and setenv/getenv. I have not confirmed. https://stackoverflow.com/questions/4788398/changes-via-setenvironmentvariable-do-not-take-effect-in-library-that-uses-geten

Regardless, it seems that whatever environment settings one makes in code do not end up in the Python environment variables. Therefore, the environment manipulation in the example code is a red herring, except for the PATH variable, which would allow one to set the Python DLL name without a full absolute path if the PATH includes the Python base folder (with pythonXX.dll).

One should instead do such environment manipulations through the Run/Test environment, i.e. Visual Studio project setup for the debugging environment.

Regarding the second issue, one may well ask how this example succeeded if Python does not see the PYTHONHOME variable and the system libraries are not to be found along the path to sys.executable. Examining the Stack Overflow answer, we find:

If it can't find these "landmark" files or sys.prefix hasn't been found yet, then python sets sys.prefix to a "fallback" value. Linux and Mac, for example, use pre-compiled defaults as the values of sys.prefix and sys.exec_prefix. Windows waits until sys.path is fully figured out to set a fallback value for sys.prefix.

For sys.path

If on Windows and no applocal = true was set in pyvenv.cfg, then the contents of the subkeys of the registry key HK_CURRENT_USER\Software\Python\PythonCore\\PythonPath\ are added, if any.

On Mac and Linux, the value of sys.exec_prefix is added. On Windows, the directory which was used (or would have been used) to search dynamically for sys.prefix is added.

Finally

At this stage on Windows, if no prefix was found, then python will try to determine it by searching all the directories in sys.path for the landmark files, as it tried to do with the directory of sys.executable previously, until it finds something. If it doesn't, sys.prefix is left blank.

Examining my registry shows that HKEY_CURRENT_USER\SOFTWARE\Python\PythonCore\3.8\PythonPath contains

C:\Users\myuser\AppData\Local\Programs\Python\Python38\Lib\;C:\Users\myuser\AppData\Local\Programs\Python\Python38\DLLs\

So it seems Windows was able to resolve the path issue using the registry and then applies this path to sys.prefix despite it having no resemblance to sys.executable.
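
For anyone who wants to check the same registry value without opening regedit, here is a minimal sketch using the standard winreg module. It assumes the per-user key layout shown above and reads the key's default value. The helper name read_registry_python_path is hypothetical, and on non-Windows platforms it simply returns None.

```python
import sys
from typing import Optional


def read_registry_python_path(version: str = "3.8") -> Optional[str]:
    """Return the default value of HKCU\\Software\\Python\\PythonCore\\<version>\\PythonPath."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg

    subkey = rf"Software\Python\PythonCore\{version}\PythonPath"
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        return None  # key or default value not present


print(read_registry_python_path("3.8"))
```

If this prints None on Windows, the install may have been registered per-machine rather than per-user, in which case the fallback described above would have to come from somewhere else.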

What about the rest of the sys.path entries?

So, now that we've figured out what happened, how do we fix this to work with virtual environments properly, specifically PEP 405 "lightweight" environments? Well, unfortunately, it seems we can't replicate the setup exactly without lots of manual intervention. A PEP 405 environment looks something like this (on my system):

C:\Users\myuser\py38
    Include
        <empty>
    Lib
        site-packages
            <all your installed packages for the environment>
    Scripts
        activate
        activate.bat
        Activate.ps1
        ...
        pip.exe
        pip3.8.exe
        pip3.exe
        python.exe
        pythonw.exe
        <and others depending on what packages you've installed>
    share
    pyvenv.cfg

What is not included is pythonXX.dll, which is exactly the file needed by pythonnet to function.

In "normal" operations, I would put C:\Users\myuser\py38\Scripts first in my path (using the activation script), which would launch Python using the python.exe contained there. Therefore, sys.executable would point to C:\Users\myuser\py38\Scripts\python.exe, Python would "walk back" through the filesystem and find pyvenv.cfg, which contains home = C:\Users\myuser\AppData\Local\Programs\Python\Python38. Therefore, we get this environment

# From PYTHONPATH
C:\Users\myuser\OneDrive\Documents\Python Scripts
# From base environment
C:\Users\myuser\AppData\Local\Programs\Python\Python38\python38.zip
C:\Users\myuser\AppData\Local\Programs\Python\Python38\DLLs
C:\Users\myuser\AppData\Local\Programs\Python\Python38\lib
C:\Users\myuser\AppData\Local\Programs\Python\Python38
# From venv site-packages
C:\Users\myuser\py38
C:\Users\myuser\py38\lib\site-packages
C:\Users\myuser\py38\lib\site-packages\win32
C:\Users\myuser\py38\lib\site-packages\win32\lib
C:\Users\myuser\py38\lib\site-packages\Pythonwin

Why don't we just set PYTHONHOME to the venv folder and be done with it? Unfortunately, this doesn't work because PYTHONHOME needs to point to the root of the Python install with the system libraries, which this lightweight venv does not have. (It seems PYTHONHOME is explicitly incompatible with venv because it is unset/cached in the activate script).
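
One way to see the difference is to check each folder for the standard library "landmark" that Python's path search looks for (Lib\os.py in the Windows layout, as far as I understand the startup logic). The helper below is only an illustration under that assumption, and has_stdlib_landmark is a made-up name.

```python
from pathlib import Path


def has_stdlib_landmark(root: str) -> bool:
    """True if `root` contains Lib/os.py, the Windows-layout stdlib landmark."""
    return (Path(root) / "Lib" / "os.py").is_file()


def describe(root: str) -> str:
    """Summarize whether `root` looks usable as PYTHONHOME."""
    if has_stdlib_landmark(root):
        return f"{root}: full install, usable as PYTHONHOME"
    return f"{root}: no standard library here (e.g. a lightweight venv)"
```

On my layout I would expect the base install folder to pass this check and C:\Users\myuser\py38 to fail it, since the venv's Lib folder holds only site-packages.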

Proposed Solution

So, in order to replicate the venv setup, we need to follow a rather messy manual process.

In the following steps, all environment variables should be set outside of code (i.e. in your shell or Visual Studio project setup) in order to avoid the problem mentioned before where values do not propagate down into Python after being changed in C#.

  1. Get pythonnet to be able to find your python3X.dll. This can be done in a number of ways.

    • Set PYTHONNET_PYDLL environment variable to point directly to the DLL (absolute path).
    • Add your base install Python home folder (containing python3X.dll) to the top of PATH, then set PYTHONNET_PYDLL to python3X.dll.
    • Add your base install Python home folder (containing python3X.dll) to the top of PATH, then set PYTHONNET_PYVER to 3.X.
    • Set Runtime.PythonDLL to point directly to the DLL (absolute path) in code.
    • Add your base install Python home folder (containing python3X.dll) to the top of PATH, then set Runtime.PythonDLL to python3X.dll in code.
  2. Set your Python home explicitly. This will avoid any shenanigans regarding autodetection of the base environment when sys.executable is not python.exe. Again, this can be done in a couple ways.

    • Set the PYTHONHOME environment variable to your base install Python home folder.
    • Set PythonEngine.PythonHome to your base install Python home folder in code.
  3. Specify the virtual environment path somehow. You could invent an environment variable for this, or supply directly in code.

    var pathToVirtualEnv = Environment.GetEnvironmentVariable("PYTHONNET_PYVENV"); // I invented this new environment variable
    // or (use one of the two declarations, not both)
    // var pathToVirtualEnv = @"C:\Users\myuser\py38"; // path to my virtualenv
  4. If the virtual environment is enabled, set the "no site" flag on the Python interpreter. Normally one does this by calling the interpreter with the -S option. Pythonnet provides a way to set this flag through PythonEngine.SetNoSiteFlag. Unfortunately, this currently has a bug on Windows, which is documented in this GitHub issue. To work around the bug, one must access some part of the PythonEngine API to load the DLL before calling PythonEngine.SetNoSiteFlag. However, this API access must not cause the interpreter to initialize. (If you set PythonEngine.PythonHome in the previous step, this will suffice.) For example:

    if (!String.IsNullOrEmpty(pathToVirtualEnv))
    {
        // Access the API so the DLL gets loaded
        string version = PythonEngine.Version;
        // Now we can set the flag and have it stick
        PythonEngine.SetNoSiteFlag();
    }
  5. Load the interpreter and run the site package manually with modified settings.

    
    PythonEngine.Initialize();
    using (Py.GIL())
    {
        if (!String.IsNullOrEmpty(pathToVirtualEnv))
        {
            // fix the prefixes to point to our venv
            // (This is for Windows, there may be some difference with sys.exec_prefix on other platforms)
            dynamic sys = Py.Import("sys");
            sys.prefix = pathToVirtualEnv;
            sys.exec_prefix = pathToVirtualEnv;

            dynamic site = Py.Import("site");
            // This has to be overwritten because site module may already have been loaded by the interpreter (but not run yet)
            site.PREFIXES = new List<PyObject> { sys.prefix, sys.exec_prefix };
            // Run site path modification with tweaked prefixes
            site.main();
        }
    }
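
A lighter-weight variant, if all you need is for the venv's packages to be importable, is to add the venv's site-packages folder with site.addsitedir, which also processes any .pth files found there. To be clear, this is not the procedure described in the steps above: it leaves sys.prefix and sys.exec_prefix untouched, so anything that inspects those will still see the base install. The sketch is plain Python (from C# it could be run as a Python snippet after initialization), and add_venv_site_packages is a hypothetical helper name.

```python
import os
import site
import sys
from pathlib import Path


def venv_site_packages(venv: str) -> Path:
    """Locate site-packages inside a PEP 405 venv for the running interpreter."""
    if os.name == "nt":
        return Path(venv) / "Lib" / "site-packages"
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return Path(venv) / "lib" / version / "site-packages"


def add_venv_site_packages(venv: str) -> Path:
    """Add the venv's site-packages to sys.path and process its .pth files."""
    target = venv_site_packages(venv)
    if not target.is_dir():
        raise FileNotFoundError(f"no site-packages at {target}")
    site.addsitedir(str(target))
    return target
```

The venv must have been created for the same Python version as the embedded interpreter, otherwise compiled extension modules in it are unlikely to load.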


Here is a full code example. I used as much setup in the external environment as possible (because hardcoding paths into software is not a great way to go.)

Environment variables (from launchSettings.json):
```json
{
  "profiles": {
    "TestPythonnet": {
      "commandName": "Project",
      "environmentVariables": {
        "PYTHONNET_PYVENV": "C:\\Users\\myuser\\py38",
        "PYTHONNET_PYVER": "3.8",
        "PYTHONHOME": "C:\\Users\\myuser\\AppData\\Local\\Programs\\Python\\Python38",
        "PATH": "C:\\Users\\myuser\\AppData\\Local\\Programs\\Python\\Python38; %PATH%"
      }
    }
  }
}
```

Program:

using System;
using System.Collections.Generic;
using Python.Runtime;

namespace TestPythonnet
{
    class Program
    {
        static void Main(string[] args)
        {

            var pathToVirtualEnv = Environment.GetEnvironmentVariable("PYTHONNET_PYVENV");

            if (!String.IsNullOrEmpty(pathToVirtualEnv))
            {
                // Access the API so the DLL gets loaded
                string version = PythonEngine.Version;
                // Now we can set the flag and have it stick
                PythonEngine.SetNoSiteFlag();
            }

            PythonEngine.Initialize();
            using (Py.GIL())
            {
                if (!String.IsNullOrEmpty(pathToVirtualEnv))
                {
                    // fix the prefixes to point to our venv
                    // (This is for Windows, there may be some difference with sys.exec_prefix on other platforms)
                    dynamic sys = Py.Import("sys");
                    sys.prefix = pathToVirtualEnv;
                    sys.exec_prefix = pathToVirtualEnv;

                    dynamic site = Py.Import("site");
                    // This has to be overwritten because site module may already have 
                    // been loaded by the interpreter (but not run yet)
                    site.PREFIXES = new List<PyObject> { sys.prefix, sys.exec_prefix };
                    // Run site path modification with tweaked prefixes
                    site.main();
                }

            }

            using (Py.GIL())
            {
                dynamic sys = Py.Import("sys");
                dynamic os = Py.Import("os");
                Console.WriteLine($"PYTHONHOME: {os.getenv("PYTHONHOME")}");
                Console.WriteLine($"PYTHONPATH: {os.getenv("PYTHONPATH")}");
                Console.WriteLine($"sys.executable: {sys.executable}");
                Console.WriteLine($"sys.prefix: {sys.prefix}");
                Console.WriteLine($"sys.base_prefix: {sys.base_prefix}");
                Console.WriteLine($"sys.exec_prefix: {sys.exec_prefix}");
                Console.WriteLine($"sys.base_exec_prefix: {sys.base_exec_prefix}");
                Console.WriteLine("sys.path:");
                foreach (var p in sys.path)
                {
                    Console.WriteLine(p);
                }
                Console.WriteLine();
            }

            PythonEngine.Shutdown();
        }
    }
}

Output:

PYTHONHOME: C:\Users\myuser\AppData\Local\Programs\Python\Python38
PYTHONPATH: C:\Users\myuser\OneDrive\Documents\Python Scripts
sys.executable: C:\Users\myuser\source\repos\TestPythonnet\TestPythonnet\bin\Debug\netcoreapp3.1\TestPythonnet.exe
sys.prefix: C:\Users\myuser\py38
sys.base_prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.exec_prefix: C:\Users\myuser\py38
sys.base_exec_prefix: C:\Users\myuser\AppData\Local\Programs\Python\Python38
sys.path:
C:\Users\myuser\OneDrive\Documents\Python Scripts
C:\Users\myuser\AppData\Local\Programs\Python\Python38\python38.zip
C:\Users\myuser\AppData\Local\Programs\Python\Python38\DLLs
C:\Users\myuser\AppData\Local\Programs\Python\Python38\lib
C:\Users\myuser\source\repos\TestPythonnet\TestPythonnet\bin\Debug\netcoreapp3.1
C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.1.18
C:\Users\myuser\py38
C:\Users\myuser\py38\lib\site-packages
C:\Users\myuser\py38\lib\site-packages\win32
C:\Users\myuser\py38\lib\site-packages\win32\lib
C:\Users\myuser\py38\lib\site-packages\Pythonwin

This is pretty close to the environment generated by running python normally through the venv, modulo the extra .NET folders.
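
To make that comparison less eyeball-driven, the same facts can be collected as a small dictionary from both a normal venv run and the embedded interpreter and then diffed. This is a sketch in plain Python; interpreter_summary is a made-up helper, and the in_venv entry relies on the usual convention that sys.prefix differs from sys.base_prefix inside a virtual environment.

```python
import json
import os
import sys


def interpreter_summary() -> dict:
    """Collect the values that determine which packages are importable."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "exec_prefix": sys.exec_prefix,
        "base_exec_prefix": sys.base_exec_prefix,
        "in_venv": sys.prefix != sys.base_prefix,
        "PYTHONHOME": os.environ.get("PYTHONHOME"),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "path": list(sys.path),
    }


print(json.dumps(interpreter_summary(), indent=2))
```

Saving the JSON from each run to a file and diffing the two should show at a glance which sys.path entries differ.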

Proposed Enhancement Request

Obviously, this is rather ugly and unfortunate. The root cause of the problem is that we cannot control the value of sys.executable so that Python can follow its standard process to load the venv. Fortunately, there is a new config API added to Python 3.8 with PEP 587. This API allows fine-grained control over the startup Python configuration, including sys.executable. Implementing access to this API through pythonnet would allow us to simply set sys.executable to the venv version of python.exe and have everything work from there.

filmor commented 3 years ago

Thank you for the detailed report. PRs for this are very welcome!

A few comments:

lostmsu commented 3 years ago

@Meisterrichter offtopic idea: if you want your .NET app to run within a virtualenv, could you maybe launch it from the Python executable? E.g. do

# TODO: setup clr first

import clr
clr.AddReference('MyApp')
from MyApp import PythonEntryPoint
PythonEntryPoint.PublicMainFunction()

?