Status: Closed (gutow closed 2 years ago)
The main issues I saw with making JUPYTER_PREFER_ENV_PATH set by default are:
Does anyone know of a way to reliably tell if sys.prefix comes from a virtual environment or not?
On the other points:
Only use versions available at the system level if they have not been overridden at the previous two levels.
That already happens in JupyterLab - extensions in earlier directories in the path override extensions in later directories.
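To make that override rule concrete, here is a minimal sketch of "first directory wins" resolution. It is not JupyterLab's actual code: `resolve_extensions` is a hypothetical helper, and it assumes each extension is simply an entry inside one of the search directories.

```python
from pathlib import Path


def resolve_extensions(search_dirs):
    """Map each extension name to the first directory that provides it.

    Hypothetical helper: search_dirs is ordered from most to least specific.
    """
    resolved = {}
    for directory in search_dirs:
        directory = Path(directory)
        if not directory.is_dir():
            # Missing levels (e.g. no user-level directory) are skipped.
            continue
        for entry in sorted(directory.iterdir()):
            # setdefault keeps the entry found in the earliest directory,
            # so later directories never override earlier ones.
            resolved.setdefault(entry.name, entry)
    return resolved
```

With this shape, the whole precedence question reduces to the order of `search_dirs`, which is what the rest of this thread is about.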
There may need to be a switch to not use packages from the user or system level.
I think you can set JUPYTER_CONFIG_DIR and JUPYTER_DATA_DIR to override the user-level directories - possibly setting them to empty effectively turns off the user level. I don't know of a way to turn off the system level (and not sure there is a real-world use case in practice).
2. I didn't see a way to reliably tell if sys.prefix was from a virtual environment or a system installation - that's why we opted for an explicit user setting.
I'm confused by this. Does this mean you are overriding sys.prefix? If you are operating in a virtual environment, sys.prefix should point to the virtual environment. Is there a way that it might not (other than the user specifically setting it otherwise)? Thus, I believe sys.prefix should take precedence.
From the python documentation:
sys.prefix
A string giving the site-specific directory prefix where the platform independent Python files are installed; on Unix, the default is '/usr/local'. This can be set at build time with the --prefix argument to the configure script. See Installation paths for derived paths.
Note
If a virtual environment is in effect, this value will be changed in site.py to point to the virtual environment. The value for the Python installation will still be available, via base_prefix.
Does this mean you are overriding sys.prefix?
No, what I mean is that sometimes sys.prefix is intended to be more specific than the user-level directory (for example, if a single person is using multiple virtual environments), and other times sys.prefix is intended to be less specific than the user-level directory (for example, if you are not using a virtual environment, sys.prefix points to a system location like /usr/local shared by many users). I couldn't find a reliable way to tell if the user wants sys.prefix to be more or less specific than user-level directories.
One heuristic is to check if sys.prefix and the user-level directory share a common prefix, i.e., see if the sys.prefix points to a directory inside the user's home directory. That heuristic assumes the path location indicates the precedence. I don't know if that heuristic would be reliable enough to make default. For example, a user might change the user-level path to something outside their home directory, or might be using virtual environments based out of a directory outside the home directory.
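For illustration, the common-prefix heuristic could be sketched roughly as below. `prefix_is_under` is a hypothetical name, and treating the home directory as the user-level root is an assumption, which is exactly the weakness described above.

```python
import sys
from pathlib import Path


def prefix_is_under(prefix, user_root):
    """Return True if prefix is user_root itself or lives inside it."""
    prefix = Path(prefix).resolve()
    user_root = Path(user_root).resolve()
    # Compare whole path components rather than raw strings, so that
    # /home/alice2 is not mistaken for a child of /home/alice.
    return prefix == user_root or user_root in prefix.parents


if __name__ == "__main__":
    # Assumption: the user's home directory stands in for the user level.
    print(prefix_is_under(sys.prefix, Path.home()))
```

A virtual environment kept outside the home directory returns False here even though the user presumably wants it to take precedence.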
From the docs you quoted, another heuristic might be to examine sys.base_prefix and sys.prefix. If they are different, assume we are running in a virtual environment and make sys.prefix more specific than user-level directories. This has the following issues:
I guess I am not understanding what the problem is. If someone is not using a virtual environment, sys.prefix should point to the proper directory to look for things in. If they are using a virtual environment it should point to the proper directory. Is the issue how to climb the tree above that looking for things?
- I think it doesn't handle other virtual environment solutions like conda/mamba, where I think sys.prefix and sys.base_prefix will be the same since the whole Python install is inside the virtual environment. I think this doesn't prevent us from handling Python venv better by default, but it would be nice if we could find a solution that handles both cases.
Even in this case, I do not understand the problem. This probably means I do not understand what the code is doing with the directory tree.
I also note this issue about platform-dependent directories: https://github.com/jupyter/jupyter_core/issues/234. Is part of the problem an attempt to account for platform-dependent differences in jupyter_core?
@gutow, as I understand it, the issue is that on a shared system, the order of specificity is: system, user, virtual/conda env. sys.prefix can be either system or virtual env, so we don't know where to prioritize the user setting.
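A small sketch of why that matters: the same sys.prefix directory lands in a different slot depending on whether it is an isolated environment. `search_order` and its parameters are hypothetical names for illustration, not jupyter_core's API.

```python
def search_order(env_dir, user_dir, system_dirs, env_is_isolated):
    """Return search directories from most to least specific.

    env_dir stands for the sys.prefix-based directory. If sys.prefix is a
    virtual/conda env it is more specific than the user level; if it is a
    shared system install it is less specific than the user level.
    """
    if env_is_isolated:
        return [env_dir, user_dir, *system_dirs]
    return [user_dir, env_dir, *system_dirs]
```

The open question in this thread is how to compute `env_is_isolated` without asking the user.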
One thing we could do to detect if we are in a virtual/conda env is look for sys.prefix != sys.base_prefix or "CONDA_PREFIX" in os.environ. That would cover the vast majority of cases, and seems reasonable for a default setting that can be overridden. Either way I think we'd have to bump a major version of jupyter_core to make the change of default.
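Spelled out as a function, that check might look like the sketch below. `looks_like_env` is a hypothetical name, and the optional parameters exist only so the logic can be exercised without a real environment. This is not necessarily what jupyter_core ends up doing.

```python
import os
import sys


def looks_like_env(prefix=None, base_prefix=None, environ=None):
    """Heuristic: are we running inside a venv or a conda/mamba env?"""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    environ = os.environ if environ is None else environ

    # venv/virtualenv: site.py repoints sys.prefix, base_prefix keeps the original.
    in_venv = prefix != base_prefix
    # conda/mamba: both prefixes match, so fall back to the environment variable.
    in_conda = "CONDA_PREFIX" in environ
    return in_venv or in_conda
```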
sys.prefix != sys.base_prefix or "CONDA_PREFIX" in os.environ
That sounds like a reasonable default for the virtual env solutions we know about. I assume mamba sets the CONDA_PREFIX env variable?
Yes, I only use mamba now, and I verified. :smile:
One thing we could do to detect if we are in a virtual/conda env is look for sys.prefix != sys.base_prefix or "CONDA_PREFIX" in os.environ. That would cover the vast majority of cases, and seems reasonable for a default setting that can be overridden. Either way I think we'd have to bump a major version of jupyter_core to make the change of default.
If this works, I think that would provide the behavior most would expect of their virtual environments.
And inside a virtual env:
>>> import sys
>>> sys.prefix
'/private/tmp/foo'
>>> sys.base_prefix
'/Users/steve.silvester/miniconda'
And inside a virtual env:
That looks like a venv inside a conda virtual env :)
"CONDA_PREFIX" in os.environ.
We probably also want to check that sys.prefix starts with CONDA_PREFIX, since we might be in a conda env without a python interpreter (like an R conda env, etc.).
Oh, interesting, yeah, that makes sense.
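A rough sketch of that extra check follows. `conda_env_owns_interpreter` is a hypothetical name, and it compares path components instead of using a literal string startswith, so that a sibling env such as envs/py2 is not matched by envs/py. Treat it as one possible reading of the suggestion, not the code in the PR.

```python
import os
import sys
from pathlib import Path


def conda_env_owns_interpreter(prefix=None, environ=None):
    """Return True only if CONDA_PREFIX is set and sys.prefix lives inside it."""
    prefix = sys.prefix if prefix is None else prefix
    environ = os.environ if environ is None else environ

    conda_prefix = environ.get("CONDA_PREFIX")
    if not conda_prefix:
        return False

    prefix_path = Path(prefix).resolve()
    conda_path = Path(conda_prefix).resolve()
    # An active conda env without its own Python (e.g. an R env) fails this
    # test, because the running interpreter's prefix is somewhere else.
    return prefix_path == conda_path or conda_path in prefix_path.parents
```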
I took a preliminary stab at this in https://github.com/jupyter/jupyter_core/pull/286 - anyone feel free to take over it.
I propose that JUPYTER_PREFER_ENV_PATH=1 be made the default behavior. I cannot imagine a case where someone would set up a virtual environment and not expect Python-based software such as JupyterLab to use what is installed in the virtual environment first. Thus, I believe JupyterLab should never use versions of extensions, etc., outside of the virtual environment in preference to those inside. I can imagine people wanting to set up some utilities that work in all environments, so I can see there might be issues surrounding that. I suggest the following logic:
I encountered an unexpected issue with ipywidgets because of this (see https://github.com/jupyter-widgets/ipywidgets/issues/3559).
The original implementation was discussed in https://github.com/jupyter/jupyter_core/pull/199
Thanks for a great tool.
Jonathan