msavtchouk-pf closed this issue 6 months ago
Reproduces in pure pex:
$ python -m pex gmsh==4.12.1 -o out.pex
$ ./out.pex -c "import gmsh"
Warning: could not find Gmsh shared library libgmsh.4.12.dylib with ctypes.util.find_library() or in the following locations: [<snip>]
@jsirois I'll take a look at this as an exercise in getting more familiar with this part of the Pex code, if that's ok with you.
@benjyw sounds great. I'll not provide hints - just speak up if you need any aid.
Yes, no spoilers!
OK, got around to looking at this properly. The issue is that gmsh.py computes the location of the shared library here:
moduledir = os.path.dirname(os.path.realpath(__file__))
...
possible_libpaths = [
os.path.join(moduledir, libname),
...,
<many other possible paths relative to moduledir>,
...,
]
But since gmsh.py is a symlink into installed_wheels (either from under $PEX_ROOT/unzipped_pexes or $PEX_ROOT/venvs, depending on --venv mode), os.path.realpath sends us into the installed_wheels dir, where the data dirs are under the chroot and not splatted out to their final, expected destinations.
The one case that does work is --layout=loose --venv, which makes sense since we don't symlink in that case.
In the --venv case, --venv-site-packages-copies is a workaround.
In the non --venv case, we will need a similar escape hatch.
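The symlink effect described above can be reproduced in isolation. This is a minimal sketch, not gmsh's or Pex's actual code: the directory names, the libgmsh.so file name and the find_sibling helper are all hypothetical stand-ins, and it assumes a platform where os.symlink is permitted.

```python
import os
import tempfile


def find_sibling(module_file: str, libname: str, resolve_symlinks: bool) -> bool:
    """Return True if libname sits next to module_file, optionally after resolving symlinks."""
    path = os.path.realpath(module_file) if resolve_symlinks else os.path.abspath(module_file)
    return os.path.exists(os.path.join(os.path.dirname(path), libname))


with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-ins for the installed wheel chroot and the venv site-packages dir.
    chroot = os.path.join(root, "installed_wheels")
    site_packages = os.path.join(root, "site-packages")
    os.makedirs(chroot)
    os.makedirs(site_packages)

    # The module's real file lives in the chroot; site-packages only holds a symlink to it.
    real_module = os.path.join(chroot, "gmsh.py")
    open(real_module, "w").close()
    linked_module = os.path.join(site_packages, "gmsh.py")
    os.symlink(real_module, linked_module)

    # The shared library was placed next to the symlink, not next to the real file.
    open(os.path.join(site_packages, "libgmsh.so"), "w").close()

    found_with_realpath = find_sibling(linked_module, "libgmsh.so", resolve_symlinks=True)
    found_without_realpath = find_sibling(linked_module, "libgmsh.so", resolve_symlinks=False)
    print(found_with_realpath, found_without_realpath)
```

Resolving the symlink moves the search into the chroot, where the library is absent; this is why copying files instead of symlinking them makes the lookup succeed.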
@msavtchouk-pf Can you check if this workaround works for you:
On your pex_binary target, set execution_mode=venv and venv_site_packages_copies=True (see https://www.pantsbuild.org/2.18/reference/targets/pex_binary for documentation of these options).
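In a BUILD file, that suggestion might look roughly like the fragment below. The target name and entry point are hypothetical placeholders; only the execution_mode and venv_site_packages_copies fields come from the suggestion above.

```python
# BUILD file fragment (Pants target definition, not a standalone script).
pex_binary(
    name="app",  # hypothetical target name
    entry_point="app.py",  # hypothetical entry point
    execution_mode="venv",
    venv_site_packages_copies=True,
)
```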
There is no escape hatch for this except a venv. Hopefully your journey took you down the road of reading https://peps.python.org/pep-0427/#installing-a-wheel-distribution-1-0-py32-none-any-whl
When a dist like gmsh uses data, you run into issues for these cases:
1st, PEX aside, at a conceptual level for the distribution-1.0.data/ items:
purelib/: OK sysconfig.get_paths()["purelib"]
platlib/: OK sysconfig.get_paths()["platlib"]
headers/: BROKEN by design and not OK. [^1]
scripts/: OK sysconfig.get_paths()["scripts"]
data/: OK sysconfig.get_paths()["data"]
For a PEX wheel chroot, the 1st 2 are handled since both "purelib" and "platlib" are effectively synonyms for sys.path and the PEX runtime places each resolved installed wheel chroot on the sys.path before handing off to user code. Of the remaining 2 (since "headers" is broken period for venvs as well [^1]), "data" is the problem: unlike "purelib" and "platlib", which are multihome because sys.path is a list, and similarly "scripts", which are multihome via PATH, which is also a list, "data" can only be sanely looked up relative to the single value for sysconfig.get_paths()["data"]. This is an impedance mismatch with a zipapp-style PEX execution where there are numerous wheel chroots composed. You might do something clever like creating a link farm in a tmp dir and mutating sysconfig.get_paths()["data"] for the current interpreter. That would fail though when symlinks confused libraries (or would be slow otherwise using copies), and in either case it would fail when user code re-execs using sys.executable.
This is all a very long-winded way to say this is why I introduced --venv et al. ~2 years ago now. There is simply no way to make PEX zipapp mode compatible with all user code. People should take to heart Pants and Pex docs here and use --venv mode always (or at the 1st whiff of trouble) unless they are sensitive to cold boot times, which is the only advantage PEX zipapp mode provides, and is the only reason I did not default to it when I added the feature.
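To see the single-valued versus multihome distinction on a given interpreter, a short probe like the following can help. It is only an illustrative sketch using the standard library; the printed paths depend on the interpreter and environment it runs in.

```python
import sys
import sysconfig

paths = sysconfig.get_paths()

# Each of these install scheme keys maps to exactly one directory.
for key in ("purelib", "platlib", "scripts", "data"):
    print(f"{key:8} -> {paths[key]}")

# By contrast, module lookup is multihome: sys.path is a list of directories.
print(f"sys.path has {len(sys.path)} entries")

data_is_single_path = isinstance(paths["data"], str)
sys_path_is_list = isinstance(sys.path, list)
```

Since "data" resolves to one directory per interpreter, several wheel chroots cannot each contribute their own "data" root the way they can each contribute a sys.path entry.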
[^1]: A comment I wrote from a branch I'm working on:
# We match the value Pip concocts in `pip._internal.locations` since Pip is the de-facto
# standard people expect in the vacuum of Python / PyPA issuing more "MUST"y PEPs.
#
# The "headers" install scheme path is basically invalid today for venvs as tracked by:
# + https://github.com/python/cpython/issues/88611
# + https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/1
#
# The basic thrust in this Feb 2023 conversation is typical of the PyPA, roughly:
#
# > PyPA member:
# https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/7
# If NumPy has figured out how to make this work, it must be possible, so lets call it
# good then.
# > Core SymPy maintainer:
# https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/8
# NumPy has figured it out, but by totally working around the longstanding Python /
# PyPA non-solution.
# > Core Python member:
# https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/11
# I think the current situation is actually ideal!
# > NumPy lead:
# https://discuss.python.org/t/deprecating-the-headers-wheel-data-key/23712/12
# We do insane BS to make this work which requires folks wishing to build against us
# to do further insane bs (c.f. numerous `setup_requires` / `setup.py` fiascoes
# ameliorated by PEP-518).
#
# The conversation bits I snipped contrast 2 non Python core / non PyPA members who know
# what's going on against a Python core maintainer and a PyPA maintainer who do not.
# This 2023 conversation ended with no resulting action, which contrasts with the real
# use case Pex fixed in https://github.com/pantsbuild/pex/issues/1656 in 2022 which
# originated at least as far back as 2010 in greenlet (
# https://github.com/python-greenlet/greenlet/commit/93abb2fc95ef99527bed858966b8af457f3dc0a5#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7R78)
# which uses `setup(headers=...)` to attempt to allow other C-extensions (uwsgi is an
# example) to link to it during sdist builds. Although
# https://github.com/python-greenlet/greenlet/issues/96 is closed, the "solution" there
# was to manually include the well-known Pip venv include location via ~:
# `CFLAGS="-I/tmp/py3.5/include/site/python3.5" pip ...`
Thanks for looking at this @benjyw. I want to highlight this is not really a workaround. This is the go-to that folks should default to unless they have compelling reasons not to.
FWIW, gmsh is making its life hard by using data to package its .so and doing all the probing it does instead of just placing it in the ~root of the wheel (found in sys.path) which is what nearly every platform-specific wheel does. I'm having a hard time imagining a principled reason they do this. I suspect it is just fallout of the confusion, tangle, and ever-changing landscape that is Python packaging standards.
> There is no escape hatch for this except a venv. Hopefully your journey took you down the road of reading https://peps.python.org/pep-0427/#installing-a-wheel-distribution-1-0-py32-none-any-whl
It sure did (and I had also read that when reviewing your wheel install changes originally).
Great. @benjyw I'll tell you that branch comment above was the 1st time (~a week ago), that I finally felt confident I understood wheels after ~5 years of involvement. I always thought I was just dumb not getting the headers thing. I finally searched a bit wider and found out there is nothing to understand - it is broken :/.
> FWIW, gmsh is making its life hard by using data to package its .so and doing all the probing it does instead of just placing it in the ~root of the wheel (found in sys.path) which is what nearly every platform-specific wheel does. I'm having a hard time imagining a principled reason they do this. I suspect it is just fallout of the confusion, tangle, and ever-changing landscape that is Python packaging standards.
Yeah, I did some shallow diving into why they do this, and I think it's just confusion / this is a python wrapper written by c people.
> Thanks for looking at this @benjyw. I want to highlight this is not really a workaround. This is the go-to that folks should default to unless they have compelling reasons not to.
I can see recommending --venv ~always (we could even move towards making --venv the default in pants with appropriate deprecation cycle), but are you also suggesting that we recommend/move towards --venv-site-packages-copies as the standard?
> Great. @benjyw I'll tell you that branch comment above was the 1st time (~a week ago), that I finally felt confident I understood wheels after ~5 years of involvement. I always thought I was just dumb not getting the headers thing. I finally searched a bit wider and found out there is nothing to understand - it is broken :/.
I feel your pain. It's one thing to learn by reading an RFC or a standard of some kind, it's a lot less fun when there is no standard and the relevant knowledge is "some arbitrary subset of the current state of the world".
> I can see recommending --venv ~always (we could even move towards making --venv the default in pants with appropriate deprecation cycle), but are you also suggesting that we recommend/move towards --venv-site-packages-copies as the standard?
All I'm saying is the most venv-like configuration is exactly all of:
--venv prepend --venv-site-packages-copies --non-hermetic-venv-scripts
Every option you trim lets in differences from a standard venv and thus potential breaks for apps. I won't comment on what Pants should do.
@msavtchouk-pf, Benjy and I went on, but I wanted to draw attention back to the solution @benjyw presented here: https://github.com/pantsbuild/pex/issues/2336#issuecomment-1901143926
Can you please test that and report back? I'd love to close this issue as an answered question assuming that does, in fact, answer your question / problem.
Hey! Thanks so much for looking into it. I still have the same error, unfortunately:
2024-01-26 13:33:20.621 CET
import gmsh
2024-01-26 13:33:20.621 CET
File "/root/.pex/venvs/9fe0ec941045a50fbda7bdef0455c29aa2e40e34/779eb2cc0ca9e2fdd204774cbc41848e4e7c5055/lib/python3.10/site-packages/gmsh.py", line 87, in
The workaround I found is installing libgmsh-dev gmsh in the container where I run.
@msavtchouk-pf although the error looks similar, it's completely different. I believe libGLU.so.1 is not provided by any wheel; it is only provided by the host OS. In other words, you should see an identical issue using a standard venv + Pip. You can install gmsh in that venv using Pip, but it will also fail to load libGLU.so.1 unless you 1st install the native package for it (libglu1-mesa for Ubuntu 22.04 for example). Pex does not (cannot) solve that style of host dependency. It can only handle self-contained wheels hermetically. Non-self-contained wheels will always need their external host dependencies manually installed out of band.
@msavtchouk-pf for example:
$ docker run --rm -it python:3.10 bash -c 'python3.10 -mvenv example.venv && source example.venv/bin/activate && pip install -U pip && pip install gmsh && python -c "import gmsh"'
Requirement already satisfied: pip in /example.venv/lib/python3.10/site-packages (23.0.1)
Collecting pip
Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 11.0 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 23.0.1
Uninstalling pip-23.0.1:
Successfully uninstalled pip-23.0.1
Successfully installed pip-23.3.2
Collecting gmsh
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl.metadata (1.7 kB)
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl (39.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.3/39.3 MB 13.9 MB/s eta 0:00:00
Installing collected packages: gmsh
Successfully installed gmsh-4.12.2
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/example.venv/lib/python3.10/site-packages/gmsh.py", line 87, in <module>
lib = CDLL(libpath)
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libGLU.so.1: cannot open shared object file: No such file or directory
To correct, I need to add some native package installs, namely libglu1-mesa, libgl1, libxcursor1 and libxinerama1 for Debian 12.4 bookworm:
$ docker run --rm -it python:3.10 bash -c 'apt update && apt install -y libglu1-mesa libgl1 libxcursor1 libxinerama1 && python3.10 -mvenv example.venv && source example.venv/bin/activate && pip install -U pip && pip install gmsh && python -c "import gmsh"'
...
Setting up libxcb-dri3-0:amd64 (1.15-1) ...
Setting up libx11-xcb1:amd64 (2:1.8.4-2+deb12u2) ...
Setting up libpciaccess0:amd64 (0.17-2) ...
Setting up libxcb-xfixes0:amd64 (1.15-1) ...
Setting up libglvnd0:amd64 (1.6.0-1) ...
Setting up libxcb-glx0:amd64 (1.15-1) ...
Setting up libsensors-config (1:3.6.0-7.1) ...
Setting up libopengl0:amd64 (1.6.0-1) ...
Setting up libxxf86vm1:amd64 (1:1.1.4-1+b2) ...
Setting up libxcb-present0:amd64 (1.15-1) ...
Setting up libz3-4:amd64 (4.8.12-3.1) ...
Setting up libxfixes3:amd64 (1:6.0.0-2) ...
Setting up libxcb-sync1:amd64 (1.15-1) ...
Setting up libxinerama1:amd64 (2:1.1.4-3) ...
Setting up libsensors5:amd64 (1:3.6.0-7.1) ...
Setting up libglapi-mesa:amd64 (22.3.6-1+deb12u1) ...
Setting up libxcb-dri2-0:amd64 (1.15-1) ...
Setting up libxshmfence1:amd64 (1.3-1) ...
Setting up libxcb-randr0:amd64 (1.15-1) ...
Setting up libllvm15:amd64 (1:15.0.6-4+b1) ...
Setting up libglu1-mesa:amd64 (9.0.2-1.1) ...
Setting up libdrm-common (2.4.114-1) ...
Setting up libxcursor1:amd64 (1:1.2.1-1) ...
Setting up libdrm2:amd64 (2.4.114-1+b1) ...
Setting up libdrm-amdgpu1:amd64 (2.4.114-1+b1) ...
Setting up libdrm-nouveau2:amd64 (2.4.114-1+b1) ...
Setting up libdrm-radeon1:amd64 (2.4.114-1+b1) ...
Setting up libdrm-intel1:amd64 (2.4.114-1+b1) ...
Setting up libgl1-mesa-dri:amd64 (22.3.6-1+deb12u1) ...
Setting up libglx-mesa0:amd64 (22.3.6-1+deb12u1) ...
Setting up libglx0:amd64 (1.6.0-1) ...
Setting up libgl1:amd64 (1.6.0-1) ...
Processing triggers for libc-bin (2.36-9+deb12u3) ...
Requirement already satisfied: pip in /example.venv/lib/python3.10/site-packages (23.0.1)
Collecting pip
Downloading pip-23.3.2-py3-none-any.whl (2.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 10.9 MB/s eta 0:00:00
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 23.0.1
Uninstalling pip-23.0.1:
Successfully uninstalled pip-23.0.1
Successfully installed pip-23.3.2
Collecting gmsh
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl.metadata (1.7 kB)
Downloading gmsh-4.12.2-py2.py3-none-manylinux_2_24_x86_64.whl (39.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.3/39.3 MB 14.1 MB/s eta 0:00:00
Installing collected packages: gmsh
Successfully installed gmsh-4.12.2
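Before importing a wheel that depends on host libraries, the host can be probed with the standard library. This is a rough sketch: the short names below are guesses derived from the packages installed above, not a list published by gmsh, and ctypes.util.find_library can return None on minimal systems even when a library is present.

```python
import ctypes.util


def probe(names):
    """Map each short library name to what the host resolves it to, or None if not found."""
    return {name: ctypes.util.find_library(name) for name in names}


# Short names assumed from libglu1-mesa, libgl1, libxcursor1 and libxinerama1.
results = probe(["GLU", "GL", "Xcursor", "Xinerama"])
for name, resolved in results.items():
    print(f"{name}: {resolved or 'NOT FOUND on this host'}")
```

Any entry reported as not found points at a native package that has to be installed out of band, independent of Pex or Pip.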
@msavtchouk-pf with this extra information / detail about what Pex (and Pip) can and cannot do, is your question answered? I think if you wanted handling of native packages as well, the only solution today is Conda which I know little about.
Thanks a lot for your help! It answers my question.
I use pants package to get a .pex python binary. In requirements file I have a dependency:
gmsh==4.12.1
When I execute the binary, I see the following problem.
I can see indeed that the library is not in the root directory or the /lib subdir. It is in a /.prefix/lib subdir. In the root dir there is a .layout.json file. Its contents are:
{"record_relpath": "gmsh-4.12.1.dist-info/RECORD", "stash_dir": ".prefix"}
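For illustration, the layout file's contents can be read to work out where the stashed files sit inside the wheel directory. This is only a sketch: stashed_path is a hypothetical helper, not a Pex API, and "/path/to/chroot" is a placeholder for the installed wheel directory.

```python
import json
import os

layout = json.loads(
    '{"record_relpath": "gmsh-4.12.1.dist-info/RECORD", "stash_dir": ".prefix"}'
)


def stashed_path(chroot: str, relpath: str) -> str:
    """Join a relative path under the chroot's stash dir (hypothetical helper)."""
    return os.path.join(chroot, layout["stash_dir"], relpath)


# "/path/to/chroot" stands in for the real installed wheel directory.
lib_dir = stashed_path("/path/to/chroot", "lib")
print(lib_dir)
```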
Reproducible on Ubuntu with Pex 2.1.137 (the version reported by pants help pex-cli).