jupyter-xeus / xeus-python-wheel

Building a PyPI wheel for xeus-python
BSD 3-Clause "New" or "Revised" License

Sys.path setup issues #30

Closed SylvainCorlay closed 4 years ago

SylvainCorlay commented 4 years ago

When using the conda package for xeus-python, xeus-python links with libpython.so, and sys.path is set automatically to what it would be under normal Python usage.

However, we can't rely on there being a libpython.so when building a wheel, because some Python distributions do not even include one.

Hence, when building the wheel, we embed Python in the xpython executable by linking statically with libpython.a. This means that the xeus-python wheel literally includes the Python interpreter built in the manylinux environment, which is a "stock build" of Python with standard options. Thankfully, it is binary compatible with most operating systems, but it does not include the patches to CPython that various distributions of Python apply.

The way we currently handle this in the wheel is by manually setting sys.path to all the possible paths, relative to the installation prefix, that we have encountered.
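As a rough illustration of the hard-coded approach described above, the sketch below builds a list of candidate sys.path entries relative to an installation prefix. The helper name `candidate_sys_path` and the exact set of directories are assumptions made for illustration; the list the wheel actually uses is not shown in this thread.

```python
import sys
from pathlib import PurePosixPath


def candidate_sys_path(prefix, version=None):
    """Return hypothetical sys.path candidates located under `prefix`.

    The directory names mirror a stock CPython layout on Linux, plus a
    Debian-style dist-packages directory. This is an illustrative guess,
    not the list used by the xeus-python wheel.
    """
    if version is None:
        version = f"{sys.version_info.major}.{sys.version_info.minor}"
    root = PurePosixPath(prefix)
    lib = root / "lib" / f"python{version}"
    return [
        str(root / "lib" / f"python{version.replace('.', '')}.zip"),
        str(lib),
        str(lib / "lib-dynload"),
        str(lib / "site-packages"),
        str(lib / "dist-packages"),  # Debian-style location (assumption)
    ]


paths = candidate_sys_path("/usr", "3.8")
for entry in paths:
    print(entry)
```

Any layout that is not in the list is simply missed, which is the weakness of the approach.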

But this does not cover virtual environments at all; for this we need to

SylvainCorlay commented 4 years ago

The short-term course of action would be to support Debian's case by extending the hard-coded sys.path further.

Then, we should look into how core Python sets sys.path, or fetch it directly by calling into the Python executable...
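The second option, asking the Python executable for its own sys.path, could look roughly like the sketch below. The helper name `query_sys_path` is hypothetical, and `sys.executable` stands in for whichever interpreter belongs to the target environment; how xpython would locate that interpreter is not covered here.

```python
import json
import subprocess
import sys


def query_sys_path(python=sys.executable):
    """Ask a Python executable for its sys.path and return it as a list."""
    code = "import json, sys; print(json.dumps(sys.path))"
    result = subprocess.run(
        [python, "-c", code],
        check=True,
        capture_output=True,
        text=True,
    )
    # Only parse the last line, in case a startup hook printed something first.
    return json.loads(result.stdout.strip().splitlines()[-1])


target_path = query_sys_path()
for entry in target_path:
    print(repr(entry))
```

This delegates every distribution-specific patch to the interpreter that carries it, at the cost of spawning a process at startup.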

SylvainCorlay commented 4 years ago

Added Debian's dist-packages in #32

Now the dynamic version will probably rely on reading a lot of https://github.com/python/cpython/blob/3.8/Lib/site.py
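For reference, site.py already exposes part of this logic: `site.getsitepackages()` accepts a list of prefixes and returns the site-packages directories it would derive for them. The sketch below is only an illustration; the helper name `site_packages_for` and the prefix `/opt/example` are made up, and the result depends on which site.py is loaded (a distribution-patched one may return different directories).

```python
import site


def site_packages_for(prefix):
    """Return the site-packages directories site.py derives for `prefix`."""
    # getsitepackages() does not check whether the directories exist.
    return site.getsitepackages([prefix])


dirs = site_packages_for("/opt/example")
for d in dirs:
    print(d)
```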

SylvainCorlay commented 4 years ago

I am curious about how the Panda3D folks do this, since they are also using a forked manylinux2010 Docker image with libpython.a.

cc @rdb @Moguri

rdb commented 4 years ago

@SylvainCorlay We don't deal with this problem because our use case relies on the bundled executable being completely self-contained and isolated. To that effect, we embed a frozen version of the standard Python library in the executable, and we also set Py_NoUserSiteDirectory. Python libraries existing in the system are completely ignored.
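Py_NoUserSiteDirectory is the C-level counterpart of the `-s` command-line option, so its effect can be observed from a regular interpreter. The following is a small sketch, not Panda3D's code; the helper name `user_site_state` is hypothetical.

```python
import json
import subprocess
import sys

# Report the no_user_site flag and whether site.py enabled the user site.
PROBE = (
    "import json, site, sys; "
    "print(json.dumps({'no_user_site': bool(sys.flags.no_user_site), "
    "'enable_user_site': bool(site.ENABLE_USER_SITE)}))"
)


def user_site_state(*flags, python=sys.executable):
    """Run `python` with the given flags and return the user-site state."""
    out = subprocess.run(
        [python, *flags, "-c", PROBE],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out.strip().splitlines()[-1])


isolated = user_site_state("-s")
print(isolated)
```

With `-s`, the user site-packages directory (for example ~/.local/lib/pythonX.Y/site-packages) is not added to sys.path.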

In your place I would also be worried that (e.g.) the Fedora-patched version of the Python interpreter relies on a Fedora-patched version of the standard library, or vice versa, so I would be careful about assuming that you can mix and match those.

SylvainCorlay commented 4 years ago

"In your place I would also be worried that (e.g.) the Fedora-patched version of the Python interpreter relies on a Fedora-patched version of the standard library, or vice versa, so I would be careful about assuming that you can mix and match those."

Yes, this happened to us with the conda-patched version of CPython - but we managed to work around it.

[rant] Why can't people simply package the software as-is?

SylvainCorlay commented 4 years ago

Thanks for dropping by in any case, this is really nice of you to respond.

SylvainCorlay commented 4 years ago

I think that what we want is for site.__main__ to be called in the target environment, but for it to be found, the base path should be set already.

joshbode commented 4 years ago

Note: -S skips the automatic import of site.py at startup (it is only loaded if explicitly imported).

$ python -Sc "import sys; print('--- sys.path', *sys.path, sep='\n'); import site; print('--- venv()', *site.venv(set()), sep='\n'); print('--- sys.path', *sys.path, sep='\n')"
--- sys.path

/usr/lib/python38.zip
/usr/lib/python3.8
/usr/lib/python3.8/lib-dynload
--- venv()
/home/josh/.virtualenvs/default-3.8/src/black
/home/josh/repos/vimception
/home/josh/.virtualenvs/default-3.8/lib/python3.8/site-packages/_pdbpp_path_hack
/home/josh/.virtualenvs/default-3.8/src/python-xmltv
/home/josh/.virtualenvs/default-3.8/lib/python3.8/site-packages
--- sys.path
/home/josh/.virtualenvs/default-3.8/lib/python3.8/site-packages/_pdbpp_path_hack

/usr/lib/python38.zip
/usr/lib/python3.8
/usr/lib/python3.8/lib-dynload
/home/josh/.virtualenvs/default-3.8/lib/python3.8/site-packages
/home/josh/repos/vimception
/home/josh/.virtualenvs/default-3.8/src/black
/home/josh/.virtualenvs/default-3.8/src/python-xmltv

Note: the ~/.virtualenvs/default-3.8/src entries are editable packages, and _pdbpp_path_hack is a hook added by the excellent pdb++ package to override/extend pdb.

SylvainCorlay commented 4 years ago

One way to do this may be to set Py_NoSiteFlag to 1 before calling Py_Initialize, which prevents Py_Initialize from importing site.py, and then to import site.py manually at a later stage.

The manual import is normally done with PyImport_ImportModule("site");

auto m = PyImport_ImportModule("site");
if (!m) {
    PyErr_Print(); // Unable to import Python site module
}
Py_XDECREF(m);

or a more pybind11-style version: const py::object site_module = py::module::import("site"); (py::handle<> is Boost.Python syntax; the pybind11 call throws py::error_already_set if the import fails.)
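One caveat with this approach, according to the documentation of the -S option (which Py_NoSiteFlag mirrors): when the flag is set, importing site later no longer triggers the sys.path manipulations, and `site.main()` has to be called explicitly. The sketch below checks this from Python by running a child interpreter with -S; the helper name `probe_site_behaviour` is hypothetical.

```python
import json
import subprocess
import sys

# Code run in the child interpreter, started with -S.
PROBE = """
import json, sys
before = list(sys.path)
import site                      # with -S, this import alone leaves sys.path untouched
after_import = list(sys.path)
site.main()                      # the explicit call performs the site-specific setup
after_main = list(sys.path)
print(json.dumps({
    "no_site": bool(sys.flags.no_site),
    "import_changed_path": before != after_import,
    "added_by_main": [p for p in after_main if p not in before],
}))
"""


def probe_site_behaviour(python=sys.executable):
    """Report how sys.path evolves under -S, import site, then site.main()."""
    out = subprocess.run(
        [python, "-S", "-c", PROBE],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out.strip().splitlines()[-1])


result = probe_site_behaviour()
print(result)
```

In an embedding scenario this suggests that, after importing the module, the embedder would also need to call `site.main()` (for instance through PyObject_CallMethod), assuming the stock site.py is the one being loaded.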

But I presume this will not work with distributions whose patches change the location of site.py, such as Fedora, in which case we could explicitly do:

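// Try each candidate location of site.py in turn; modname and script_path are std::string.
// Note: the imp module is deprecated and was removed in Python 3.12.
const std::string s = "import imp\n"
                      "imp.load_source('" + modname + "', r'" + script_path + "')";
PyRun_SimpleString(s.c_str());
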
SylvainCorlay commented 4 years ago

There seems to be a lower-level API in CPython for multi-phase initialization of the interpreter, proposed by @ncoghlan in PEP 432, with a public configuration API available since Python 3.8 (PEP 587).

This may be what we should be using.

SylvainCorlay commented 4 years ago

Trying the bare wheel again on Ubuntu without the Py_SetPath call, the issue is that Debian-based systems default to a user install: the xpython executable gets placed into ~/.local/bin/, and ~/.local/lib only contains user-installed packages, so site.py is not found.

When forcing a system install, everything goes into /usr/local, whereas python itself lives in /usr/bin, so site.py is looked for in /usr/local/lib/pythonX.X instead of /usr/lib/pythonX.X.

SylvainCorlay commented 4 years ago

Running conda's Python with /usr as PYTHONHOME also fails on Ubuntu.

So my issue is not specific to our wheel.

(base) sylvain@mbp ~/dev/jupyter-xeus/xeus-python (pythonhome-variable)$ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jedi
>>> 
(reverse-i-search)`PYTHONHOM': git commit -m "Support the use of the ^CTHONHOME environment variable"
(base) sylvain@mbp ~/dev/jupyter-xeus/xeus-python (pythonhome-variable)$ cd
(base) sylvain@mbp ~$ PYTHONHOME=/usr/ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jedi
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'jedi'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 12, in <module>
    import subprocess, tempfile, os.path, re, pwd, grp, os, time, io
  File "/home/sylvain/subprocess.py", line 136, in <module>
    import _posixsubprocess
ModuleNotFoundError: No module named '_posixsubprocess'

Original exception was:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'jedi'
SylvainCorlay commented 4 years ago

Closing as fixed in 0.7.1.