cachix / devenv

Fast, Declarative, Reproducible, and Composable Developer Environments
https://devenv.sh
Apache License 2.0

Python: manylinux #784

Closed domenkozar closed 1 month ago

domenkozar commented 9 months ago

I'm exploring an idea to provide manylinux packages for Python by default, while allowing that set to be extended with any other libraries.

This way we avoid LD_LIBRARY_PATH pollution of the developer environment.

An example using mkdocs-material:

{ pkgs, ... }:

{
  languages.python = {
    enable = true;
    version = "3.11.3";
    venv.enable = true;
    venv.requirements = ./requirements.txt;
    manylinux.packages = [ pkgs.cairo ];
  };

  processes.docs.exec = "mkdocs serve";
}

Refs #773 #715 #555

Atry commented 9 months ago

By the way, we can force linking python3 with libstdc++ and other manylinux required libraries to support manylinux ABI out-of-the-box without LD_LIBRARY_PATH

languages.python.package =
  pkgs.python3.override (old: {
    self = old.self.overrideAttrs
      (self: super: {
        env = super.env // {
          # Link libstdc++ to python interpreter so that packages in manylinux ABI can find it out-of-the-box without LD_LIBRARY_PATH
          # TODO: Add more libraries here when encountering an ImportError
          NIX_LDFLAGS = "--no-as-needed -lstdc++ --as-needed ${super.env.NIX_LDFLAGS}";

        };
      });
  });

Atry commented 9 months ago

By the way, a Python package that requires cairo should not be considered eligible for a manylinux* platform tag.

To be eligible for the manylinux1 platform tag, a Python wheel must therefore both (a) contain binary executables and compiled code that links only to libraries with SONAMEs included in the following list:

https://peps.python.org/pep-0513/#:~:text=compatible%20kernel%20ABI.-,To%20be%20eligible%20for%20the%20manylinux1%20platform%20tag%2C%20a%20Python%20wheel%20must%20therefore%20both%20(a)%20contain%20binary%20executables%20and%20compiled%20code%20that%20links%20only%20to%20libraries%20with%20SONAMEs%20included%20in%20the%20following%20list%3A,-libpanelw.so
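The eligibility rule quoted above can be sketched as a small allowlist check. This is only an illustration: the function name `is_manylinux1_soname` and the variable `MANYLINUX1_SONAMES` are made up for this example, and the list is a deliberately partial subset of the SONAMEs in PEP 513, so a "no" from it is only meaningful for libraries (like cairo) that are not in the PEP at all.

```shell
# Partial subset of the SONAMEs that PEP 513 allows a manylinux1 wheel to link against.
# Illustrative only: consult the PEP for the complete list.
MANYLINUX1_SONAMES="libgcc_s.so.1 libstdc++.so.6 libm.so.6 libdl.so.2 librt.so.1 libc.so.6 libpthread.so.0 libglib-2.0.so.0"

# Succeeds if the given SONAME is on the (partial) allowlist above.
is_manylinux1_soname() {
  case " $MANYLINUX1_SONAMES " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_manylinux1_soname libstdc++.so.6` succeeds, while `is_manylinux1_soname libcairo.so.2` fails, which is why a wheel linking against a system cairo falls outside the manylinux1 tag.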

domenkozar commented 9 months ago

By the way, we can force linking python3 with libstdc++ and other manylinux required libraries to support manylinux ABI out-of-the-box without LD_LIBRARY_PATH

languages.python.package =
  pkgs.python3.override (old: {
    self = old.self.overrideAttrs
      (self: super: {
        env = super.env // {
          # Link libstdc++ to python interpreter so that packages in manylinux ABI can find it out-of-the-box without LD_LIBRARY_PATH
          # TODO: Add more libraries here when encountering an ImportError
          NIX_LDFLAGS = "--no-as-needed -lstdc++ --as-needed ${super.env.NIX_LDFLAGS}";

        };
      });
  });

We could; however, that means anyone using Python would be forced to recompile (or set up our cache).

I'd like to avoid that, but so far it doesn't seem to be possible without a lot of pain.

Atry commented 9 months ago

How about this?

languages.python.package =
  pkgs.python3.override (old: {
    self = callPackage "${nixpkgs}/pkgs/development/interpreters/python/wrapper.nix" {
      python = old.self;
      makeWrapperArgs = [
        "--set"
        "LD_LIBRARY_PATH"
        "${pkgs.stdenv.cc.cc.lib}/lib"
      ];
    };
  });

Atry commented 9 months ago

Ideally, we would have a way to set an environment variable that specifies a fallback library path, instead of LD_LIBRARY_PATH. Unfortunately, the ELF interpreter in glibc does not support such an environment variable.

https://github.com/NixOS/nix/issues/902

domenkozar commented 9 months ago

How about this?

languages.python.package =
  pkgs.python3.override (old: {
    self = callPackage "${nixpkgs}/pkgs/development/interpreters/python/wrapper.nix" {
      python = old.self;
      makeWrapperArgs = [
        "--set"
        "LD_LIBRARY_PATH"
        "${pkgs.stdenv.cc.cc.lib}/lib"
      ];
    };
  });

Thanks for this! I've implemented it on the #745 branch. I'm still not able to get libstdc++ to load, but it's good progress!

Atry commented 9 months ago

I didn't test makeWrapperArgs, so it might not work as expected. I did test the --no-as-needed -lstdc++ --as-needed flags, which do work.

domenkozar commented 9 months ago

It works if I just use pkgs.stdenv.cc.cc.lib, but not if I point it to the .devenv/profile/lib :thinking:

bobvanderlinden commented 8 months ago

I recently tried using autopatchelf. It is very slow atm, but it does seem to work:

enterShell = ''
    python_package_libs=($VIRTUAL_ENV/${config.languages.python.package.sitePackages}/*.libs)
    libs=${lib.makeLibraryPath [ pkgs.glibc pkgs.glib pkgs.alsa-lib pkgs.gcc-unwrapped.lib pkgs.libuuid pkgs.openssl_1_1 pkgs.libz pkgs.gcc pkgs.gst_all_1.gstreamer ]}:$(IFS=:; echo "''${python_package_libs[*]}") autopatchelf $VIRTUAL_ENV
'';

This will patch all .so files in the virtualenv. It looks up the libraries each .so needs in the libs environment variable and patches the relative paths in the ELF to use the absolute ones (/nix/store/*).

autopatchelf errors when it cannot resolve a library from its libs variable, so this method makes you aware of missing libraries and forces you to add them.

python_package_libs is currently also needed, because some Python packages ship their own libraries that should be linked to as well. I ran into cases like libgomp-a34b3233.so.1.0.0, which is apparently a very specific version of libgomp that one of the Python packages was shipping. I think these shouldn't really be needed, but autopatchelf cannot find these libraries and thus errors, whereas the Python application can already find them without patching.

Currently I haven't looked into caching the results, so it will try to patch all binaries in the virtualenv each time I enter the shell. This takes minutes. I think caching would help, but then still each change of the poetry.lock or the libs would require those minutes. autopatchelf is currently a bash script, which is partly why it is slow I think (it checks every file whether it is an ELF using file $path).
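The caching idea mentioned above could look roughly like the following. It is a minimal sketch, not something from devenv: the helper names `patch_fingerprint` and `run_if_changed` and the stamp-file approach are assumptions made for illustration, and it relies on `sha256sum` from coreutils being available.

```shell
# Print a fingerprint of everything that influences patching:
# the library search path plus the contents of the given files (e.g. poetry.lock).
# At least one file must be given, otherwise cat would read from stdin.
patch_fingerprint() {
  local libs=$1
  shift
  { printf '%s\n' "$libs"; cat -- "$@"; } | sha256sum | cut -d' ' -f1
}

# Run the given command only when the fingerprint differs from the one
# recorded in the stamp file; record the new fingerprint on success.
run_if_changed() {
  local stamp=$1 fingerprint=$2
  shift 2
  if [ -f "$stamp" ] && [ "$(cat -- "$stamp")" = "$fingerprint" ]; then
    return 0  # nothing changed since the last successful run
  fi
  "$@" && printf '%s\n' "$fingerprint" > "$stamp"
}
```

In enterShell this might be used as `run_if_changed "$VIRTUAL_ENV/.patched" "$(patch_fingerprint "$libs" poetry.lock)" env libs="$libs" autopatchelf "$VIRTUAL_ENV"`, where the `.patched` stamp path is hypothetical. As noted above, any change to poetry.lock or to the libs list still triggers a full re-patch.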

That said, this might be a good compromise between using LD_LIBRARY_PATH and rebuilding native dependencies.

EDIT: Also good to mention: to find out which libraries the packages needed, I reran autopatchelf and looked at the error it outputs (libX.so not found). I then used nix-locate and echo /nix/store/*/lib/libX.so to find the package that contained the library. This could also be automated to some extent: list the libraries needed and look up the packages for each of them.
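The lookup described in the EDIT could be automated along these lines. This is a sketch under assumptions: the function names are invented for this example, `readelf` (binutils) is assumed to be on PATH, and `candidate_packages` assumes nix-index is installed with its database already built.

```shell
# Extract the SONAMEs of NEEDED entries from `readelf -d` output on stdin.
needed_sonames() {
  sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# List the unique SONAMEs needed by every shared object under a directory.
venv_needed_sonames() {
  find "$1" -type f -name '*.so*' -print0 |
    xargs -0 -r readelf -d 2>/dev/null |
    needed_sonames | sort -u
}

# Assumption: nix-index is installed and its database has been built.
# Print candidate nixpkgs attributes that ship lib/<soname>.
candidate_packages() {
  nix-locate --top-level --minimal --whole-name --at-root "/lib/$1"
}
```

Piping `venv_needed_sonames "$VIRTUAL_ENV"` into a loop that calls `candidate_packages` for each SONAME would give a first list of packages to try. The result still needs manual review, since several packages can provide the same library and libraries bundled inside the wheels themselves will have no match.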

rupurt commented 3 months ago

@Atry I like the idea of using NIX_LDFLAGS. Perhaps it doesn't fit with devenv but I feel like it's a good solution if you're a Nix user and willing to manage your own devshell.

Unfortunately it doesn't work with jedi though and I think that would be the same for any custom build of the python interpreter. It raises the following error:

jedi.api.environment.InvalidPythonEnvironment: The python binary is potentially unsafe.

Do you have any suggestions for a workaround? i.e. how to mark the binary in Nix as safe?

domenkozar commented 1 month ago

This shipped in https://devenv.sh/blog/2023/03/20/devenv-10-rewrite-in-rust/