rocker-org / rocker-versioned2

Run current & prior versions of R using docker. rocker/r-ver, rocker/rstudio, rocker/shiny, rocker/tidyverse, and so on.
https://rocker-project.org
GNU General Public License v2.0

Why does the install_python.sh install script not provide `python` command? #860

Closed Robinlovelace closed 1 month ago

Robinlovelace commented 1 month ago

Container image name

rocker/geospatial

Container image digest

No response

What operating system related to this question?

Linux

System information

No response

Question

I tried installing Python inside a Rocker container but found that the command

python

failed. See the test Dockerfile here: https://github.com/geocompx/docker/blob/8bdde78a1d9efcd66f194f37de74a5d5ed686649/rocker-r-py-julia/Dockerfile

And the cross-language project we're looking to support with a Docker container: https://github.com/Robinlovelace/cross_language_projects

eddelbuettel commented 1 month ago

Can you please try and see if python3 works? It is one of those backwards-compatibility things: until you install package python-is-python3 the link to python may not exist.
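For scripts that have to run on images both with and without the python link, one option is to probe for the interpreter name instead of assuming it. The following is only a sketch: find_python is a hypothetical helper name, not something provided by Rocker or Debian.

```shell
# Print the first interpreter name found on PATH, preferring `python`.
# Returns non-zero if neither name is available.
find_python() {
  for candidate in python python3; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if py="$(find_python)"; then
  echo "using interpreter: $py"
else
  echo "no python interpreter on PATH" >&2
fi
```

This does not help with shebang lines that hard-code python; for those, the python-is-python3 package or a venv on the PATH is still needed.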

Robinlovelace commented 1 month ago

Yes python3 works.

eddelbuettel commented 1 month ago

So we have a possible bug report here that we should also install python-is-python3 to cover this wart.

eitsupi commented 1 month ago

Perhaps the question has been answered, so may I close this?

I think users can install them if they need to. There is no end to the debate about Python version selection, so I am skeptical of the idea that we should install anything other than the minimum required from the start.

If you really need it, I recommend that you install the version of your choice with uv at the time of your choice. (Yes, of course Julia can be installed with juliaup. Likewise, R can be installed with rig.)

eddelbuettel commented 1 month ago

You are the maintainer, it is your call what you install.

I have been bitten myself in the past by wanting python for shebang lines or other scripts, and installing python-is-python3 fixes that particular Debian / Ubuntu issue at an installed cost of 15 kB, i.e. nothing (on my 24.04, per dpkg -s).

eitsupi commented 1 month ago

There seems to be no particular comment, so I'll close it.

cboettig commented 1 month ago

I think these days Python pushes us to create a venv for all installs, which install_python.sh already does, and it places that venv on the PATH. Creating a venv creates a /opt/venv/bin directory, which includes the symlinks for all this (i.e. python and python3.10 both point to python3, and likewise both pip and pip3 are provided). This helps ensure users install Python packages in the venv and don't break system packages.

We can confirm:

docker run --rm -ti rocker/binder which python
/opt/venv/bin/python

so python does exist and importantly is the venv python (which is a symlink to the system python3).

@Robinlovelace if you're not seeing this, it sounds like there may be some kind of PATH issue, which would also mean you aren't using the venv but the system /usr/bin/python3, and that isn't ideal. But I think we need more context to debug.
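As a starting point for that kind of debugging, a small report of what each name resolves to can be run inside the container. This is a hedged sketch; report_python_paths is a hypothetical helper name, and the only assumption is a POSIX shell.

```shell
# Show which file each interpreter name resolves to, plus the
# environment variables that decide whether the venv is picked up.
report_python_paths() {
  for name in python python3 pip; do
    if resolved="$(command -v "$name" 2>/dev/null)"; then
      echo "$name -> $resolved"
    else
      echo "$name -> not found on PATH"
    fi
  done
  echo "VIRTUAL_ENV=${VIRTUAL_ENV:-<unset>}"
  echo "PATH=$PATH"
}

report_python_paths
```

If the venv is set up as described above, python and python3 should resolve under /opt/venv/bin, and /opt/venv/bin should appear in PATH ahead of /usr/bin.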

cboettig commented 1 month ago

Also more explicitly on the symlinks:

docker run --rm -ti rocker/binder ls -l /opt/venv/bin | grep python
-rwxr-xr-x 1 root staff  229 Sep 24 14:25 ipython
-rwxr-xr-x 1 root staff  229 Sep 24 14:25 ipython3
lrwxrwxrwx 1 root staff    7 Sep 24 14:25 python -> python3
lrwxrwxrwx 1 root staff   16 Sep 24 14:25 python3 -> /usr/bin/python3
lrwxrwxrwx 1 root staff    7 Sep 24 14:25 python3.10 -> python3

(As an aside, also note the staff group ownership: a user in the staff group can write to the venv.)
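The same layout can be reproduced outside the Rocker images with a throwaway venv. This sketch assumes python3 is on the PATH; the temporary directory is arbitrary, and --without-pip is used only to keep the example quick.

```shell
# Create a throwaway venv and inspect the interpreter entries in its bin/.
demo_dir="$(mktemp -d)"
python3 -m venv --without-pip "$demo_dir/venv"

# On Linux these entries are normally symlinks back to the base interpreter.
ls -l "$demo_dir/venv/bin" | grep python

# Even though the file is a link, running it through the venv path
# makes Python report the venv directory as its prefix.
"$demo_dir/venv/bin/python" -c 'import sys; print(sys.prefix)'

# Remove "$demo_dir" when finished.
```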

Robinlovelace commented 1 month ago

Thanks @cboettig that's helpful, I went with

RUN apt-get update && apt-get install -y python-is-python3

But wondering if that's necessary now... One thing I think could be worth doing in yours is something like this:

# Set env variable so reticulate uses system installation of python:
ENV RETICULATE_PYTHON=/usr/bin/python3

Source: https://github.com/geocompx/docker/blob/master/rocker-rpy/Dockerfile

cboettig commented 1 month ago

Thanks @Robinlovelace and apologies I didn't explain well, it's not just a symlink question.

First, it should not be necessary to set reticulate-specific env vars, because reticulate respects the VIRTUAL_ENV variable we are already setting:

docker run --rm -ti rocker/binder R -e 'reticulate::py_config()'
reticulate::py_config()
python:         /opt/venv/bin/python
libpython:      /usr/lib/python3.10/config-3.10-x86_64-linux-gnu/libpython3.10.so
pythonhome:     /opt/venv:/opt/venv
version:        3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
numpy:           [NOT FOUND]

NOTE: Python version was forced by VIRTUAL_ENV

Note the final line, "NOTE: Python version was forced by VIRTUAL_ENV", and that we are pointing to /opt/venv/bin/python.

Setting RETICULATE_PYTHON is not only unnecessary, but undesirable -- this makes it harder for end users to have multiple virtual environments without overriding this explicit env var which they probably don't even know exists. (for instance, can create issues with renv+python if it isn't handling this). It also means that python outside of R can be a different venv than inside R, creating another source of confusion (especially if users install additional packages on the terminal rather than via reticulate, which many do because that's what most install directions say to do).

On the symlink thing, this isn't just a matter of symlinks or aliases. Note that:

docker run --rm -ti rocker/binder python3 -m pip install xarray

is NOT the same as

docker run --rm -ti rocker/binder /usr/bin/python3 -m pip install xarray

In the former, python3 is in the venv /opt/venv and it installs packages there. In the latter, python3 is the system Python with no venv. Luckily we are not root, so pip falls back to a per-user install in the home directory, though this is not desirable for Docker builds (images get larger, and home dirs can be overwritten by bind mounts in some setups, including standard practice with Jupyter). This issue gets worse if we are root:

docker run --rm -ti --user root rocker/binder /usr/bin/python3 -m pip install xarray

Now you overwrite system packages and get this warning:

WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Long story short, we should be using the venv, and by default should be using the default venv. It sounds like this is not working for you as it should, and maybe we need to dig deeper?
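One way to tell the two cases above apart before running pip is to ask the interpreter itself whether it belongs to a venv. This is a minimal sketch, not part of install_python.sh; in_venv is a hypothetical helper name, and the check relies on sys.prefix differing from sys.base_prefix inside a venv.

```shell
# Succeed if the given interpreter runs inside a virtual environment.
in_venv() {
  "$1" -c 'import sys; raise SystemExit(0 if sys.prefix != sys.base_prefix else 1)'
}

if in_venv python3; then
  echo "python3 belongs to a venv; pip installs go into that venv"
else
  echo "python3 is not a venv interpreter (or is missing); avoid pip installs here"
fi
```

In an image where the venv is first on the PATH, the first branch is expected for python3 and the second for /usr/bin/python3.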

Robinlovelace commented 1 month ago

Thanks for the detailed response Carl, will aim to have a look and follow up, maybe in a new issue on our issue tracker. Agreed, from what you said it seems that the things you do in install_python.sh should work well for our use case.