pex-tool / pex

A tool for generating .pex (Python EXecutable) files, lock files and venvs.
https://docs.pex-tool.org/
Apache License 2.0

gmsh library not loaded in the packaged pex python binary #2336

Closed msavtchouk-pf closed 6 months ago

msavtchouk-pf commented 6 months ago

I use pants package to build a .pex Python binary. My requirements file has the dependency gmsh==4.12.1. When I execute the binary, I see the following problem:

Warning: could not find Gmsh shared library libgmsh.so.4.12 with ctypes.util.find_library() or in the following locations: ['/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/gmsh-4.12.1-py2.py3-none-manylinux_2_24_x86_64.whl/libgmsh.so.4.12', '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/gmsh-4.12.1-py2.py3-none-manylinux_2_24_x86_64.whl/lib/libgmsh.so.4.12', '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/gmsh-4.12.1-py2.py3-none-manylinux_2_24_x86_64.whl/Lib/libgmsh.so.4.12', '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/libgmsh.so.4.12', '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/lib/libgmsh.so.4.12', '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/Lib/libgmsh.so.4.12', '/root/.pex/installed_wheels/libgmsh.so.4.12', '/root/.pex/installed_wheels/lib/libgmsh.so.4.12', '/root/.pex/installed_wheels/Lib/libgmsh.so.4.12']
When I inspect the mentioned directory '/root/.pex/installed_wheels/5b4820c98b3a9fff6fe08010f8dfd52a77ca6ecc46b0a1212245061298862c60/gmsh-4.12.1-py2.py3-none-manylinux_2_24_x86_64.whl', I can see that the library is indeed not in the root directory or in the lib subdir. It is in the .prefix/lib subdir. In the root dir there is a .layout.json file. Its contents are {"record_relpath": "gmsh-4.12.1.dist-info/RECORD", "stash_dir": ".prefix"}
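The manual inspection described above can be sketched in Python. This is only a way to look around the cache by hand, assuming `.layout.json` has the shape quoted in this comment; the function name `find_stashed_libs` is hypothetical and is not a Pex API.

```python
import json
from pathlib import Path


def find_stashed_libs(wheel_chroot, pattern="*.so*"):
    """List shared libraries stashed under the chroot's `stash_dir`.

    Assumes `.layout.json` contains a `stash_dir` key, as in the
    contents quoted above. Paths are returned relative to the chroot.
    """
    chroot = Path(wheel_chroot)
    layout = json.loads((chroot / ".layout.json").read_text())
    stash_dir = chroot / layout["stash_dir"]
    return sorted(str(p.relative_to(chroot)) for p in stash_dir.rglob(pattern))
```

Pointed at the installed wheel directory from the warning, this would be expected to report the library under `.prefix/lib` rather than at any of the locations gmsh probes.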

Reproducible on Ubuntu. pants help pex-cli reports version 2.1.137.

benjyw commented 6 months ago

Reproduces in pure pex:

$ python -m pex gmsh==4.12.1 -o out.pex
$ ./out.pex -c "import gmsh"
Warning: could not find Gmsh shared library libgmsh.4.12.dylib with ctypes.util.find_library() or in the following locations: [<snip>]

benjyw commented 6 months ago

@jsirois I'll take a look at this as an exercise in getting more familiar with this part of the Pex code, if that's ok with you.

jsirois commented 6 months ago

@benjyw sounds great. I'll not provide hints - just speak up if you need any aid.

benjyw commented 6 months ago

Yes, no spoilers!

benjyw commented 6 months ago

OK, got around to looking at this properly. The issue is that gmsh.py computes the location of the shared library here:

moduledir = os.path.dirname(os.path.realpath(__file__))
...
possible_libpaths = [
    os.path.join(moduledir, libname), 
    ...,
    <many other possible paths relative to moduledir>,
    ...,
]

But since gmsh.py is a symlink into installed_wheels (either from under $PEX_ROOT/unzipped_pexes or $PEX_ROOT/venvs, depending on --venv mode), os.path.realpath sends us into the installed_wheels dir, where the data dirs are under the chroot and not splatted out to their final, expected destinations.

The one case that does work is --layout=loose --venv, which makes sense since we don't symlink in that case.
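The effect of `os.path.realpath` on a symlinked module can be reproduced in isolation. The sketch below uses a deliberately simplified, hypothetical layout: `cache/` stands in for installed_wheels and `site/` for the venv or unzipped PEX that symlinks into it. It is not the real Pex directory structure.

```python
import os
import tempfile


def candidate_dirs(module_path):
    """Return (dir via realpath, dir via abspath) for a module file."""
    resolved = os.path.dirname(os.path.realpath(module_path))
    unresolved = os.path.dirname(os.path.abspath(module_path))
    return resolved, unresolved


def demo():
    # Resolve the temp root up front so abspath and realpath only differ
    # because of the symlink created below.
    root = os.path.realpath(tempfile.mkdtemp())
    cache = os.path.join(root, "cache")
    site = os.path.join(root, "site")
    os.makedirs(cache)
    os.makedirs(os.path.join(site, "lib"))
    open(os.path.join(cache, "gmsh.py"), "w").close()
    # The library lives next to the symlink, not next to the real file.
    open(os.path.join(site, "lib", "libgmsh.so.4.12"), "w").close()
    os.symlink(os.path.join(cache, "gmsh.py"), os.path.join(site, "gmsh.py"))

    resolved, unresolved = candidate_dirs(os.path.join(site, "gmsh.py"))
    libname = os.path.join("lib", "libgmsh.so.4.12")
    return (
        os.path.exists(os.path.join(resolved, libname)),
        os.path.exists(os.path.join(unresolved, libname)),
    )
```

Here `demo()` returns `(False, True)`: the lookup relative to the realpath lands in `cache/` and misses the library, while the lookup relative to the symlink's own directory finds it.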

benjyw commented 6 months ago

In the --venv case, --venv-site-packages-copies is a workaround.

In the non --venv case, will need a similar escape hatch.

benjyw commented 6 months ago

@msavtchouk-pf Can you check if this workaround works for you:

On your pex_binary target, set execution_mode=venv and venv_site_packages_copies=True (see https://www.pantsbuild.org/2.18/reference/targets/pex_binary for documentation of these options)
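As a sketch, the BUILD target might look like the following. The two fields `execution_mode` and `venv_site_packages_copies` are the ones named above; the target name and entry point are hypothetical placeholders.

```python
# BUILD file fragment (sketch; `name` and `entry_point` are placeholders)
pex_binary(
    name="app",
    entry_point="app.py",
    # Run from a venv created on first boot instead of zipapp mode.
    execution_mode="venv",
    # Copy files into the venv's site-packages instead of symlinking them.
    venv_site_packages_copies=True,
)
```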

jsirois commented 6 months ago

There is no escape hatch for this except a venv. Hopefully your journey took you down the road of reading https://peps.python.org/pep-0427/#installing-a-wheel-distribution-1-0-py32-none-any-whl

When a dist like gmsh uses data, you run into issues for these cases:

1st, PEX aside, at a conceptual level for the distribution-1.0.data/ items:

For a PEX wheel chroot, the 1st 2 are handled since both "purelib" and "platlib" are effectively synonyms for sys.path and the PEX runtime places each resolved installed wheel chroot on the sys.path before handing off to user code. Of the remaining 2 (since "headers" is broken period for venvs as well [^1]):

This is all a very long-winded way to say this is why I introduced --venv et al. ~2 years ago now. There is simply no way to make PEX zipapp mode compatible with all user code. People should take to heart Pants and Pex docs here and use --venv mode always (or at the 1st whiff of trouble) unless they are sensitive to cold boot times. Faster cold boot is the only advantage PEX zipapp mode provides, and it is the only reason I did not make --venv the default when I added the feature.
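For reference, the wheel spec linked above has installers move each `distribution-1.0.data/<key>/` subtree to the matching install scheme path. The sketch below prints where four of those keys land for the running interpreter. It assumes the keys discussed here are `purelib`, `platlib`, `scripts` and `data`; the function name is hypothetical.

```python
import sysconfig


def wheel_data_destinations():
    """Map wheel `.data/` keys to this interpreter's install paths."""
    paths = sysconfig.get_paths()
    # `headers` is left out on purpose: sysconfig's schemes have no key
    # by that name, which is related to the footnote that follows.
    return {key: paths[key] for key in ("purelib", "platlib", "scripts", "data")}


if __name__ == "__main__":
    for key, destination in wheel_data_destinations().items():
        print(f"{key:8} -> {destination}")
```

Run inside a venv, `data` points at the venv root, which is why a library shipped under `.data/data/lib/` ends up in the venv's `lib` directory and not next to the module in site-packages.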

[^1]: A comment I wrote from a branch I'm working on:

    # We match the value Pip concocts in `pip._internal.locations` since Pip is the de-facto
    # standard people expect in the vacuum of Python / PyPA issuing more "MUST"y PEPs.
    #
    # The "headers" install scheme path is basically invalid today for venvs as tracked by:
    # + https://github.com/python/cpython/issues/88611
    # + https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/1
    #
    # The basic thrust in this Feb 2023 conversation is typical of the PyPA, roughly:
    #
    # > PyPA member:
    #   https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/7
    #   If NumPy has figured out how to make this work, it must be possible, so let's call it
    #   good then.
    # > Core SymPy maintainer:
    #   https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/8
    #   NumPy has figured it out, but by totally working around the longstanding Python /
    #   PyPA non-solution.
    # > Core Python member:
    #   https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/11
    #   I think the current situation is actually ideal!
    # > NumPy lead:
    #   https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/12
    #   We do insane BS to make this work which requires folks wishing to build against us
    #   to do further insane bs (c.f. numerous `setup_requires` / `setup.py` fiascoes
    #   ameliorated by PEP-518).
    #
    # The conversation bits I snipped contrast 2 non Python core / non PyPA members who know
    # what's going on against a Python core maintainer and a PyPA maintainer who do not.
    # This 2023 conversation ended with no resulting action, which contrasts with the real
    # use case Pex fixed in https://github.com/pantsbuild/pex/issues/1656 in 2022 which
    # originated at least as far back as 2010 in greenlet (
    # https://github.com/python-greenlet/greenlet/commit/93abb2fc95ef99527bed858966b8af457f3dc0a5#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7R78)
    # which uses `setup(headers=...)` to attempt to allow other C-extensions (uwsgi is an
    # example) to link to it during sdist builds. Although
    # https://github.com/python-greenlet/greenlet/issues/96 is closed, the "solution" there
    # was to manually include the well-known Pip venv include location via ~:
    # `CFLAGS="-I/tmp/py3.5/include/site/python3.5" pip ...`

jsirois commented 6 months ago

Thanks for looking at this @benjyw. I want to highlight this is not really a workaround. This is the go-to that folks should default to unless they have compelling reasons not to.

jsirois commented 6 months ago

FWIW, gmsh is making its life hard by using data to package its .so and doing all the probing it does instead of just placing it in the ~root of the wheel (found in sys.path) which is what nearly every platform-specific wheel does. I'm having a hard time imagining a principled reason they do this. I suspect it is just fallout of the confusion, tangle, and ever-changing landscape that is Python packaging standards.

benjyw commented 6 months ago

> There is no escape hatch for this except a venv. Hopefully your journey took you down the road of reading https://peps.python.org/pep-0427/#installing-a-wheel-distribution-1-0-py32-none-any-whl

It sure did (and I had also read that when reviewing your wheel install changes originally).

jsirois commented 6 months ago

Great. @benjyw I'll tell you that branch comment above was the 1st time (~a week ago), that I finally felt confident I understood wheels after ~5 years of involvement. I always thought I was just dumb not getting the headers thing. I finally searched a bit wider and found out there is nothing to understand - it is broken :/.

benjyw commented 6 months ago

> FWIW, gmsh is making its life hard by using data to package its .so and doing all the probing it does instead of just placing it in the ~root of the wheel (found in sys.path) which is what nearly every platform-specific wheel does. I'm having a hard time imagining a principled reason they do this. I suspect it is just fallout of the confusion, tangle, and ever-changing landscape that is Python packaging standards.

Yeah, I did some shallow diving into why they do this, and I think it's just confusion / this is a Python wrapper written by C people.

benjyw commented 6 months ago

> Thanks for looking at this @benjyw. I want to highlight this is not really a workaround. This is the go-to that folks should default to unless they have compelling reasons not to.

I can see recommending --venv ~always (we could even move towards making --venv the default in pants with appropriate deprecation cycle), but are you also suggesting that we recommend/move towards --venv-site-packages-copies as the standard?

benjyw commented 6 months ago

> Great. @benjyw I'll tell you that branch comment above was the 1st time (~a week ago), that I finally felt confident I understood wheels after ~5 years of involvement. I always thought I was just dumb not getting the headers thing. I finally searched a bit wider and found out there is nothing to understand - it is broken :/.

I feel your pain. It's one thing to learn by reading an RFC or a standard of some kind; it's a lot less fun when there is no standard and the relevant knowledge is "some arbitrary subset of the current state of the world".

jsirois commented 6 months ago

> I can see recommending --venv ~always (we could even move towards making --venv the default in pants with appropriate deprecation cycle), but are you also suggesting that we recommend/move towards --venv-site-packages-copies as the standard?

All I'm saying is the most venv-like configuration is exactly all of:

--venv prepend --venv-site-packages-copies --non-hermetic-venv-scripts

Every option you trim lets in differences from a standard venv and thus potential breaks for apps. I won't comment on what Pants should do.
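Combined with the `python -m pex` reproduction earlier in the thread, the most venv-like configuration could be assembled like this. It is only a sketch: the helper name is hypothetical, and it builds the command line without running it.

```python
import shlex


def venv_like_pex_command(requirement, output):
    """Build a `python -m pex` argv using the three flags listed above."""
    return [
        "python", "-m", "pex", requirement,
        "-o", output,
        "--venv", "prepend",
        "--venv-site-packages-copies",
        "--non-hermetic-venv-scripts",
    ]


if __name__ == "__main__":
    # Print a shell-ready command line for copy and paste.
    print(shlex.join(venv_like_pex_command("gmsh==4.12.1", "out.pex")))
```

Dropping any one of the three flags from the list gives a PEX that differs from a standard venv in the corresponding way.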

jsirois commented 6 months ago

@msavtchouk-pf, Benjy and I went on, but I wanted to draw attention back to the solution @benjyw presented here: https://github.com/pantsbuild/pex/issues/2336#issuecomment-1901143926

Can you please test that and report back? I'd love to close this issue as an answered question assuming that does, in fact, answer your question / problem.

msavtchouk-pf commented 6 months ago

Hey! Thanks so much for looking into it. I still have the same error, unfortunately:

2024-01-26 13:33:20.621 CET import gmsh
2024-01-26 13:33:20.621 CET File "/root/.pex/venvs/9fe0ec941045a50fbda7bdef0455c29aa2e40e34/779eb2cc0ca9e2fdd204774cbc41848e4e7c5055/lib/python3.10/site-packages/gmsh.py", line 87, in <module>
2024-01-26 13:33:20.621 CET lib = CDLL(libpath)
2024-01-26 13:33:20.621 CET File "/workdir/.pyenv/versions/3.10.12/lib/python3.10/ctypes/__init__.py", line 374, in __init__
2024-01-26 13:33:20.621 CET self._handle = _dlopen(self._name, mode)
2024-01-26 13:33:20.621 CET OSError: libGLU.so.1: cannot open shared object file: No such file or directory

The workaround I found is installing the libgmsh-dev and gmsh packages in the container where I run the binary.

jsirois commented 6 months ago

@msavtchouk-pf although the error looks similar, it's completely different. I believe libGLU.so.1 is not provided by any wheel, it is only provided by the host OS. In other words, you should see an identical issue using a standard venv + Pip. You can install gmsh in that venv using Pip, but it will also fail to load libGLU.so.1 unless you 1st install the native package for it (libglu1-mesa for Ubuntu 22.04 for example). Pex does not (cannot) solve that style of host dependency. It can only handle self contained wheels hermetically. Non self contained wheels will always need their external host dependencies manually installed out of band.
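One way to tell this kind of failure apart from the original one is to ask the dynamic loader directly, outside of gmsh. The sketch below is a hypothetical helper, not part of Pex or gmsh; it tries to open each shared library by name and reports the ones that fail.

```python
import ctypes


def unloadable_libraries(sonames):
    """Return the names from `sonames` that the dynamic loader cannot open."""
    missing = []
    for soname in sonames:
        try:
            ctypes.CDLL(soname)
        except OSError:
            # Raised when the library, or one of its own dependencies,
            # cannot be found on this host.
            missing.append(soname)
    return missing


if __name__ == "__main__":
    print(unloadable_libraries(["libGLU.so.1"]))
```

If `libGLU.so.1` shows up in the result under a plain interpreter with no PEX involved, the fix is a host package install, not a Pex option.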

jsirois commented 6 months ago

@msavtchouk-pf for example:

$ docker run --rm -it python:3.10 bash -c 'python3.10 -mvenv example.venv && source example.venv/bin/activate && pip install -U pip && pip install gmsh && python -c "import gmsh"'
Requirement already satisfied: pip in /example.venv/lib/python3.10/site-packages (23.0.1)
Collecting pip
  Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 11.0 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.0.1
    Uninstalling pip-23.0.1:
      Successfully uninstalled pip-23.0.1
Successfully installed pip-23.3.2
Collecting gmsh
  Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl.metadata (1.7 kB)
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl (39.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.3/39.3 MB 13.9 MB/s eta 0:00:00
Installing collected packages: gmsh
Successfully installed gmsh-4.12.2
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/example.venv/lib/python3.10/site-packages/gmsh.py", line 87, in <module>
    lib = CDLL(libpath)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libGLU.so.1: cannot open shared object file: No such file or directory

To correct, I need to add some native package installs, namely libglu1-mesa, libgl1, libxcursor1 and libxinerama1 for Debian 12.4 bookworm:

$ docker run --rm -it python:3.10 bash -c 'apt update && apt install -y libglu1-mesa libgl1 libxcursor1 libxinerama1 && python3.10 -mvenv example.venv && source example.venv/bin/activate && pip install -U pip && pip install gmsh && python -c "import gmsh"'
...
Setting up libxcb-dri3-0:amd64 (1.15-1) ...
Setting up libx11-xcb1:amd64 (2:1.8.4-2+deb12u2) ...
Setting up libpciaccess0:amd64 (0.17-2) ...
Setting up libxcb-xfixes0:amd64 (1.15-1) ...
Setting up libglvnd0:amd64 (1.6.0-1) ...
Setting up libxcb-glx0:amd64 (1.15-1) ...
Setting up libsensors-config (1:3.6.0-7.1) ...
Setting up libopengl0:amd64 (1.6.0-1) ...
Setting up libxxf86vm1:amd64 (1:1.1.4-1+b2) ...
Setting up libxcb-present0:amd64 (1.15-1) ...
Setting up libz3-4:amd64 (4.8.12-3.1) ...
Setting up libxfixes3:amd64 (1:6.0.0-2) ...
Setting up libxcb-sync1:amd64 (1.15-1) ...
Setting up libxinerama1:amd64 (2:1.1.4-3) ...
Setting up libsensors5:amd64 (1:3.6.0-7.1) ...
Setting up libglapi-mesa:amd64 (22.3.6-1+deb12u1) ...
Setting up libxcb-dri2-0:amd64 (1.15-1) ...
Setting up libxshmfence1:amd64 (1.3-1) ...
Setting up libxcb-randr0:amd64 (1.15-1) ...
Setting up libllvm15:amd64 (1:15.0.6-4+b1) ...
Setting up libglu1-mesa:amd64 (9.0.2-1.1) ...
Setting up libdrm-common (2.4.114-1) ...
Setting up libxcursor1:amd64 (1:1.2.1-1) ...
Setting up libdrm2:amd64 (2.4.114-1+b1) ...
Setting up libdrm-amdgpu1:amd64 (2.4.114-1+b1) ...
Setting up libdrm-nouveau2:amd64 (2.4.114-1+b1) ...
Setting up libdrm-radeon1:amd64 (2.4.114-1+b1) ...
Setting up libdrm-intel1:amd64 (2.4.114-1+b1) ...
Setting up libgl1-mesa-dri:amd64 (22.3.6-1+deb12u1) ...
Setting up libglx-mesa0:amd64 (22.3.6-1+deb12u1) ...
Setting up libglx0:amd64 (1.6.0-1) ...
Setting up libgl1:amd64 (1.6.0-1) ...
Processing triggers for libc-bin (2.36-9+deb12u3) ...
Requirement already satisfied: pip in /example.venv/lib/python3.10/site-packages (23.0.1)
Collecting pip
  Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 10.9 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.0.1
    Uninstalling pip-23.0.1:
      Successfully uninstalled pip-23.0.1
Successfully installed pip-23.3.2
Collecting gmsh
  Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl.metadata (1.7 kB)
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl (39.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.3/39.3 MB 14.1 MB/s eta 0:00:00
Installing collected packages: gmsh
Successfully installed gmsh-4.12.2

@msavtchouk-pf with this extra information / detail about what Pex (and Pip) can and cannot do, is your question answered? I think if you wanted handling of native packages as well, the only solution today is Conda, which I know little about.

msavtchouk-pf commented 6 months ago

Thanks a lot for your help! It answers my question.