Open hute37 opened 1 year ago
Spurious /usr/local/bin/python link to system python is in this script
To enable pip package installations that require recompilation (Python.h), python3-dev is also required
@cboettig Any thoughts on this?
@eitsupi thanks for the ping and @hute37 thanks for the issue, we do probably need to at least document some of these things better. Additionally, some of these might need fixing or may at least be unnecessary. You raise a lot of different issues here, so I'll try and hit on each but we might want to break this out into different threads.
"where is/are the right place(s) to configure environment/PATH variables ?"
Yes, great question, but unfortunately the answer depends on a handful of things, as @hute37 observes above. Most of these choices are not in our control.
Most important of these is whether you are configuring environmental variables to be accessed from the RStudio interface or through another mechanism (e.g. direct bash or R console from the container, not via RStudio, or via the S6 init system). RStudio's R console only gets its environmental variables from R's various .Renviron / Renviron.site files and the default RStudio settings, not from the system environmental variables. We don't control this, of course, but we should probably document it more clearly, along with advice about how to pass environmental variables. Many of the rocker scripts write to $R_HOME/etc/Renviron for this reason.
Because Docker users frequently pass environmental variables via docker run --env or --env-file, we attempt to pass most of these (with some exceptions like PASSWORD) up transparently to the RStudio R console, as you noted:
R_HOME/etc/Renviron.site is not read by s6 supervisor but is written (!) by /etc/cont-init.d/01_set_env
I'm not sure why it's surprising that s6 supervisor isn't reading Renviron.site -- Renviron.site is meant to be read by R processes.
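To make the forwarding idea concrete, here is a minimal sketch in POSIX shell. It is not the actual /etc/cont-init.d/01_set_env script; the function name forward_env_to_renviron and its arguments are hypothetical, and it assumes single-line values that need no quoting in an Renviron file.

```shell
# Hypothetical helper (not the rocker script): copy the current environment
# into an Renviron-style file, skipping any variable names given as extra
# arguments (e.g. PASSWORD).
# Usage: forward_env_to_renviron FILE [EXCLUDED_NAME...]
forward_env_to_renviron() {
  target=$1
  shift
  # Keep only lines that look like NAME=value. Continuation lines of
  # multi-line values are not handled properly by this sketch.
  env | grep -E '^[A-Za-z_][A-Za-z0-9_]*=' | while IFS= read -r line; do
    name=${line%%=*}
    skip=no
    for excluded in "$@"; do
      if [ "$name" = "$excluded" ]; then skip=yes; fi
    done
    if [ "$skip" = no ]; then
      printf '%s\n' "$line" >> "$target"
    fi
  done
}
```

Something like forward_env_to_renviron "$R_HOME/etc/Renviron.site" PASSWORD would then make variables passed to the container visible to R processes, including the RStudio R console.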
/etc/bash.bashrc is ignored by the service. I tried to move everything under /etc/profile.d scripts but neither bash.bashrc nor /etc/profile get sourced.
Assuming 'the service' here refers to RStudio R console? If so, yes, RStudio R console uses Renviron files, not bash profiles, for env vars.
/etc/services.d/rstudio/run rewrites environment from '/etc/environment'
Yes, the call to rserver respects /etc/environment. This is probably superfluous, as /etc/environment is not the recommended way to set environment variables in docker, but it does provide a mechanism, independent of what gets bubbled up into .Renviron files, to configure the rserver call.
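As a rough illustration of what "respecting /etc/environment" can look like in a service run script, here is a sketch that exports simple KEY=VALUE lines from a file before starting a process. The real /etc/services.d/rstudio/run is not reproduced here; load_env_file is a hypothetical name, and the parsing is deliberately simpler than what PAM does.

```shell
# Hypothetical helper: export every simple KEY=VALUE line found in a file.
# Blank lines, comments, and lines without '=' are ignored.
# Usage: load_env_file FILE
load_env_file() {
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      ''|'#'*) continue ;;   # skip blanks and comments
      *=*) ;;                # looks like an assignment, keep going
      *) continue ;;         # anything else is ignored
    esac
    name=${line%%=*}
    value=${line#*=}
    # Accept only valid shell identifiers as variable names.
    case $name in
      ''|*[!A-Za-z0-9_]*|[0-9]*) continue ;;
    esac
    # Strip one pair of surrounding double quotes, if present.
    case $value in
      \"*\") value=${value#\"}; value=${value%\"} ;;
    esac
    export "$name=$value"
  done < "$1"
}
```

A run script could call load_env_file /etc/environment and then exec the server, so that the server process inherits those variables regardless of what ends up in the Renviron files.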
The [rsession.sh](https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/rsession.sh) script, I think, isn't actively used for anything right now? (I think this was there for running rstudio without root?)
under /usr/local/bin there is a link to system python3:
Right, under vanilla ubuntu, system python is never bound to python, a hold-over from the python-2 days that now feels somewhat quaint. This symlink just puts system python on the system path without having to include the 3. It should be harmless, except that, as you note, it looks like somewhere /usr/local/bin is being prepended to PATH ahead of your poetry links, which doesn't look right. I'm not exactly sure where that happens.
If I understand correctly, your main concern is getting pyenv to use non-system python, yes? That's a rather more focused issue than the more general issue of where/how to set environmental variables (which at least does need more documentation). It's been a while since I've tested the install_pyenv.sh script in https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/install_pyenv.sh#L35 (most users seem to prefer the conda-based mechanism for installing alternate versions of python, which has more built-in support in reticulate already), but I don't see anything in there that should be putting /usr/local/bin before the pyenv paths....
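One way to see which python wins, and what it shadows, is to list every match on PATH in lookup order. The following is a small sketch; list_on_path is a hypothetical helper name, not part of any rocker script.

```shell
# Hypothetical helper: print every executable file called NAME that is
# found on PATH, in the order the shell would search.
# Usage: list_on_path NAME
list_on_path() {
  cmd=$1
  old_ifs=$IFS
  IFS=:
  for dir in $PATH; do
    # An empty PATH entry means the current directory.
    [ -n "$dir" ] || dir=.
    if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
      printf '%s\n' "$dir/$cmd"
    fi
  done
  IFS=$old_ifs
}
```

Running list_on_path python in the RStudio terminal and again in a plain /bin/bash session should show whether /usr/local/bin/python appears before the pyenv shims. In bash, type -a python gives similar information.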
I think that the question related to the /usr/local/bin/python symlink to /usr/bin/python3 can be addressed simply by moving the link to /usr/bin
The reason may be historical ...
Ubuntu was one of the last Linux distributions to complete the transition to python3, having many critical components in python2 (apt-get, ...), so the choice was to be explicit about the python version used by the applications
Arch was one of the first to upgrade to python3.
Because system components should be placed in /usr/bin, leaving /usr/local/bin available for "local" builds, I think that moving the link is the "right" thing.
Note:
python3-dev is required
python3-numpy could be detected if available
UBUNTU (22.04)
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
# ls -l /usr/bin/python*
lrwxrwxrwx 1 root root 10 Aug 18 2022 /usr/bin/python3 -> python3.10
-rwxr-xr-x 1 root root 5912936 Mar 10 11:55 /usr/bin/python3.10
lrwxrwxrwx 1 root root 34 Mar 10 11:55 /usr/bin/python3.10-config -> x86_64-linux-gnu-python3.10-config
lrwxrwxrwx 1 root root 17 Aug 18 2022 /usr/bin/python3-config -> python3.10-config
-rwxr-xr-x 1 root root 960 Jan 25 09:29 /usr/bin/python3-futurize
-rwxr-xr-x 1 root root 964 Jan 25 09:29 /usr/bin/python3-pasteurize
ARCH (Manjaro)
# lsb_release -a
LSB Version: n/a
Distributor ID: ManjaroLinux
Description: Manjaro Linux
Release: 22.0.0
Codename: Sikaris
# ls -l /usr/bin/python*
lrwxrwxrwx 1 root root 7 Nov 1 15:18 /usr/bin/python -> python3
lrwxrwxrwx 1 root root 10 Nov 1 15:18 /usr/bin/python3 -> python3.10
-rwxr-xr-x 1 root root 14272 Nov 1 15:18 /usr/bin/python3.10
-rwxr-xr-x 1 root root 3306 Nov 1 15:18 /usr/bin/python3.10-config
lrwxrwxrwx 1 root root 17 Nov 1 15:18 /usr/bin/python3-config -> python3.10-config
-rwxr-xr-x 1 root root 2554 Feb 15 2022 /usr/bin/python-argcomplete-check-easy-install-script
-rwxr-xr-x 1 root root 383 Feb 15 2022 /usr/bin/python-argcomplete-tcsh
lrwxrwxrwx 1 root root 14 Nov 1 15:18 /usr/bin/python-config -> python3-config
lrwxrwxrwx 1 root root 52 Apr 17 2022 /usr/bin/pythontex -> /usr/share/texmf-dist/scripts/pythontex/pythontex.py
(As a somewhat total aside, I also relied on just python recently and discovered that on Ubuntu installing the wonderfully-prosaically-named package python-is-python3 now helps.)
Now it seems to work ...
Starting from ml-verse base image I noticed that nvidia/cuda PATH directories were correctly set.
I didn't find any reference to these paths in /etc or in the Renviron* files.
Checking the original repo, nvidia/container-images/cuda, I found the setting in
I've done the same, and it is working! ;)
Mandatory reference:
Trying to understand why my settings get lost, I found some facts that I didn't know. As a brief recap:
RStudio does not execute R as a subprocess. It uses a "strange" animal: the rsession binary subprocess, which embeds R as a shared library but also acts as an RPC server to the RStudio interface:
RStudio comes in two flavors: "Open-Source" and "Pro". One of the most important add-ons in "Pro" is full support of PAM profiles and user environment initialization (see: PAM Authentication). In particular, /etc/environment is read by PAM login modules, before /etc/profile (see: What is the difference between /etc/environment and /etc/profile?)
NOTE:
# in /etc/profile.d
eval "$(pyenv init --path)"
eval "$(pyenv virtualenv-init -)"
PATH=$(P=$(echo -n $PATH | awk -v RS=: -v ORS=: '!($0 in a) {a[$0]; print $0}'); echo -n ${P:0:-1})
export PATH
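The de-duplication step above relies on awk and on bash-only substring expansion. In case the profile script is sourced by a plain sh, here is a sketch of the same idea in POSIX shell; dedupe_path is a hypothetical name, and unlike the one-liner it also drops empty PATH entries.

```shell
# Hypothetical helper: remove repeated entries from a colon-separated
# list, keeping the first occurrence of each. Empty entries are dropped.
# Usage: dedupe_path "$PATH"
dedupe_path() {
  result=
  old_ifs=$IFS
  IFS=:
  for dir in $1; do
    [ -n "$dir" ] || continue
    case ":$result:" in
      *":$dir:"*) ;;                          # already seen, skip
      *) result=${result:+$result:}$dir ;;    # first occurrence, append
    esac
  done
  IFS=$old_ifs
  printf '%s\n' "$result"
}
```

It could be used as PATH=$(dedupe_path "$PATH") followed by export PATH.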
Thanks @hute37 , this is great, glad things are working.
We really ought to add a page on https://rocker-project.org/ documenting the use of python and the ml images in rocker (which could also get into this wild west of python environment managers: conda, pyenv, pipenv, poetry, etc.), and maybe a separate one on environmental variables?
I found this problem when trying to enable pyenv + poetry python support on a keras project based on the ml-verse image.
I wrote a couple of scripts, based on standard rocker project versions
The main question is:
Following the examples, I put settings in
The Renviron file cannot execute bash code, so I manually expanded the settings given by the evals.
So far, so good ...
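Since an Renviron file only holds static NAME=value lines, the manual expansion amounts to generating those lines once from the shell. A minimal sketch follows; renviron_pyenv_lines is a hypothetical helper, and the shims-then-bin ordering is an assumption about what the pyenv init output puts on PATH, not a copy of it.

```shell
# Hypothetical helper: print static Renviron lines that stand in for the
# PATH changes normally made by the pyenv 'eval' lines.
# Usage: renviron_pyenv_lines PYENV_ROOT BASE_PATH
renviron_pyenv_lines() {
  printf 'PYENV_ROOT=%s\n' "$1"
  # Assumption: shims first, then pyenv's own bin, then the existing PATH.
  printf 'PATH=%s/shims:%s/bin:%s\n' "$1" "$1" "$2"
}
```

For example, renviron_pyenv_lines /opt/pyenv "$PATH" >> "$R_HOME/etc/Renviron.site" would append the two lines, assuming pyenv lives under /opt/pyenv as the shims path in this thread suggests.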
Running /bin/bash interactively in the container triggers the /etc/bash.bashrc evals, while running the R repl can find pyenv and poetry in the path:
In this setting, the right version is selected via pyenv (.python-version project file).
In an rstudio-server session, instead, I cannot configure it correctly.
There is a mix of conflicting settings ...
/etc/bash.bashrc is ignored by the service. I tried to move everything under /etc/profile.d scripts but neither bash.bashrc nor /etc/profile get sourced.
$R_HOME/etc/Renviron.site is not read by s6 supervisor but is written (!) by /etc/cont-init.d/01_set_env
/etc/services.d/rstudio/run rewrites environment from '/etc/environment'
there is also a rsession.sh that could be used in starting internal R sessions
Then:
under /usr/local/bin there is a link to system python3:
In the rstudio R console, I get:
In the rstudio terminal, I get:
Which is wrong, because the paths /usr/local/bin:/usr/lib/rstudio-server/resources/terminal/bash/.local/bin were prepended to the system path, and a spurious /usr/local/bin/python overrides the pyenv version under /opt/pyenv/shims
Running /bin/bash in container
Running R in container
Maybe what is missing here is the right initialization for the s6 supervisor/rstudio-server configuration ...