GoogleContainerTools / distroless

🥑 Language focused docker images, minus the operating system.
Apache License 2.0

`sh` entrypoint is present on production images? #601

Open kc-fiddler opened 4 years ago

kc-fiddler commented 4 years ago

I've just tried one of the examples (https://github.com/GoogleContainerTools/distroless/tree/master/examples/python3-requirements) and I see that I'm able to docker run --entrypoint=sh -ti hello:latest into it and start a python shell. Is that expected? This is not a :debug image.

dockerfile:

 % cat /tmp/hello/dockerfile 
FROM debian:buster-slim AS build
RUN apt-get update && \
    apt-get install --no-install-suggests --no-install-recommends --yes python3-venv gcc libpython3-dev && \
    python3 -m venv /venv && \
    /venv/bin/pip install --upgrade pip

# Build the virtualenv as a separate step: Only re-execute this step when requirements.txt changes
FROM build AS build-venv
COPY requirements.txt /requirements.txt
RUN /venv/bin/pip install --disable-pip-version-check -r /requirements.txt

# Copy the virtualenv into a distroless image
FROM gcr.io/distroless/python3-debian10
COPY --from=build-venv /venv /venv
COPY . /app
WORKDIR /app
ENTRYPOINT ["/venv/bin/python3", "hello.py"]

output of docker run:

% docker run --entrypoint=sh -ti hello:latest
# python
Python 3.7.3 (default, Jul 25 2020, 13:03:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print(os.listdir())
['hello.py', 'dockerfile', 'requirements.txt']
>>> 

the provenance of the docker image using distroless python base image

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
f7a84cf0388b        6 hours ago         /bin/sh -c #(nop)  ENTRYPOINT ["/venv/bin/py…   0B                  
5f2bca8581ac        6 hours ago         /bin/sh -c #(nop) WORKDIR /app                  0B                  
f60da75481e4        6 hours ago         /bin/sh -c #(nop) COPY dir:009514344de586000…   727B                
35a1cbf65adc        6 hours ago         /bin/sh -c #(nop) COPY dir:07fe1f6c56a9bdfd3…   15.3MB              
cd87a7fd48f2        50 years ago        bazel build ...                                 31MB                
<missing>           50 years ago        bazel build ...                                 1.97MB              
<missing>           50 years ago        bazel build ...                                 17.4MB              
<missing>           50 years ago        bazel build ...                                 1.8MB

gcr.io/distroless/python3-debian10                              latest              cd87a7fd48f2        50 years ago        52.2MB

kc-fiddler commented 4 years ago

the same is true for the base image as well:

% docker run --entrypoint=sh -ti gcr.io/distroless/python3-debian10:latest
# python
Python 3.7.3 (default, Jul 25 2020, 13:03:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print(os.listdir())
['root', 'sys', 'etc', 'proc', 'var', 'tmp', 'dev', 'home', 'bin', 'lib', 'sbin', 'run', 'usr', 'boot', '.dockerenv', 'lib64']
>>> 

kc-fiddler commented 4 years ago

% docker run --entrypoint=sh -ti gcr.io/distroless/base-debian10:latest
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown.
ERRO[0000] error waiting for container: context canceled 

the base image gcr.io/distroless/base-debian10:latest does not allow the sh entrypoint.. which I assume is the right behavior?

chanseokoh commented 4 years ago

dash was added in #229 to resolve #150. That was when using Python 2. I have no expertise in Python, so I'm not sure if Python 3 still requires a shell. Would you be willing to take a look at #150 and check if we still need it? @evanj what do you think?

kc-fiddler commented 4 years ago

Interesting.. Adding dash to the python images seems like it is needed for some python code (os.system etc).. However, distroless's definition per https://github.com/GoogleContainerTools/distroless is:

"Distroless" images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.

So, IMO, the python images are not truly distroless and arguably they give a false sense of security as one can still drop/exec into the shell and use interactive-python to do whatever they want..

If the aim of distroless is security, then having a shell in the production python images is a no-no.. That's my opinion on this.

chanseokoh commented 4 years ago

Not arguing your main point, but just to let you know a few things:

use interactive-python to do whatever they want.

Anyone can start an interactive-python to do whatever they want regardless of the presence of dash. For example, let's say I prepare a custom Python Distroless image that removed /bin/dash.

FROM gcr.io/distroless/python3-debian10
RUN ["python", "-c", "import os; os.remove('/bin/dash')"]

~/tmp/a$ docker build -t test .
~/tmp/a$ # verify it doesn't have dash
~/tmp/a$ docker run --rm -it --entrypoint sh test
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"sh\": executable file not found in $PATH": unknown.
~/tmp/a$ # anyone can start an interactive Python
~/tmp/a$ docker run --rm -it --entrypoint python test
Python 3.7.3 (default, Jul 25 2020, 13:03:44) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os;
>>> print(os.listdir())
['sbin', 'etc', 'bin', 'run', 'proc', 'home', 'boot', 'var', 'usr', 'dev', 'root', 'tmp', 'lib', 'sys', '.dockerenv', 'lib64']
>>> 

And as a current workaround, you can remove /bin/dash from gcr.io/distroless/python3-debian10 in the way shown above.
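To double-check the result from inside a container, a small Python script can report which shells are still reachable. This is only a sketch: the helper name `shell_candidates` and the list of paths it probes are assumptions made for illustration, not part of distroless.

```python
import os
import shutil


def shell_candidates(paths=("/bin/sh", "/bin/dash", "/bin/bash")):
    """Return the subset of the given paths that exist and are executable."""
    return [p for p in paths if os.path.isfile(p) and os.access(p, os.X_OK)]


if __name__ == "__main__":
    # An empty list and a None result suggest no shell is reachable.
    print("shells found:", shell_candidates())
    print("sh on PATH:", shutil.which("sh"))
```

Running it with the image's own interpreter as the entrypoint avoids needing a shell to perform the check.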

kc-fiddler commented 4 years ago

@chanseokoh good point there.. maybe, disabling python interactive mode (I don't know if that's possible) is also something that can be considered a workaround.. Let me investigate that a little bit..

chanseokoh commented 4 years ago

I don't think disabling the interactive mode will make a real difference. Anyone can still do python -c "# any python statements, such as import os; os.remove('/foo')". Basically you can run any arbitrary Python script.
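As a rough illustration of that point, the snippet below does file-system work using only the Python standard library, with no shell involved. It operates on a throwaway temporary file, so the path is an assumption chosen to keep the example harmless.

```python
import os
import tempfile

# Create a scratch file, then delete it, without ever invoking a shell.
fd, path = tempfile.mkstemp()
os.close(fd)

existed_before = os.path.exists(path)
os.remove(path)
exists_after = os.path.exists(path)

print(existed_before, exists_after)
```

The same statements could be passed to `python -c`, which is why removing the shell alone does not stop someone who can already start the interpreter.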

kc-fiddler commented 4 years ago

d'oh! python makes this extremely fun.. :)

at this point, making this work without a shell-like interface, as with the other images, is probably not simple..

evanj commented 4 years ago

This issue really made me sad, because I don't like the shell being there either. However, I needed this to get gunicorn to work, which is a pretty widely used web server framework. IMO: Distroless python's image should have a standard library that works the same as the Debian Python package it is based on.

I looked quickly at Python 3's ctypes.util module, and it looks like they may have "fixed" it, and this might not be needed for Python 3! I'm pretty sure I added some tests for this, so I might be able to find some time to try it to see if it is possible to remove dash for Python 3.

This would break os.system("echo command"), since it is defined as calling the shell, but maybe that is acceptable? I would like to believe that nothing has such a bad security hole in it, but I am also 100% sure that there are production applications out there somewhere that do this.

zoidyzoidzoid commented 3 years ago

Would it be worth patching os.system with a version that works more like go's os/exec?

evanj commented 3 years ago

This would be an incompatibility with standard Python. It probably would not affect most programs, but I am sure there are programs and libraries out there that depend on this standard library function to execute a shell.
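To make the incompatibility concrete, here is a minimal sketch of what a shell-free replacement could look like. The function `system_noshell` is hypothetical and is not something distroless or CPython ships. It also differs from the real `os.system` in its return value: it returns a plain exit code rather than an encoded wait status.

```python
import shlex
import subprocess
import sys


def system_noshell(command: str) -> int:
    """Hypothetical stand-in for os.system that never invokes /bin/sh.

    The string is split into an argument list with shlex and executed
    directly, so pipes, redirection, globbing and variable expansion
    are NOT supported. Returns the child's exit code.
    """
    argv = shlex.split(command)
    if not argv:
        return 0
    try:
        return subprocess.run(argv).returncode
    except FileNotFoundError:
        return 127  # same code a shell reports for "command not found"


# Usage: run the current interpreter without involving a shell.
exit_code = system_noshell(shlex.join([sys.executable, "-c", "raise SystemExit(3)"]))
print(exit_code)
```

Any caller that relies on shell syntax, for example `os.system("ls | wc -l")`, would behave differently under such a patch, which is the kind of breakage described above.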

Personally: My vote would be to ensure that distroless's "default" Python image can run any Python program the same way as the upstream version. We could also add a python-noshell image that would remove the shell. This would allow users to opt-in to an incompatible Python version, if they want, while still allowing a distroless image to run Python programs without compatibility issues.

kc-fiddler commented 3 years ago

There's the shell and then there's the python interactive shell. Patching os.system will do away with the former, but, one would still be able to exec into the python interactive shell and do whatever damage they'd want.

raphendyr commented 2 years ago

I do think that a distroless environment should not include os.system, as it's not supposed to be a full POSIX environment. Hence, I don't see a reason to support that. If someone was using that to start a subprocess in a distroless image, then they should rework the code to use subprocess, os.posix_spawn or the os.exec variants, depending on their needs.
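A rough sketch of those alternatives, assuming a POSIX system and Python 3.8 or newer (the version that added `os.posix_spawn`). The child command is just the current interpreter exiting with status 0, chosen so the example does not depend on any other binary being installed.

```python
import os
import subprocess
import sys

# A command expressed as an argument list, so no shell is needed to parse it.
argv = [sys.executable, "-c", "raise SystemExit(0)"]

# Option 1: subprocess, the usual high-level choice.
rc_subprocess = subprocess.run(argv).returncode

# Option 2: os.posix_spawn (Python 3.8+, POSIX only). It returns the child's
# pid, and the caller is responsible for waiting on it.
pid = os.posix_spawn(argv[0], argv, os.environ)
_, status = os.waitpid(pid, 0)
rc_spawn = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

# Option 3: os.execv(argv[0], argv) would replace the current process
# entirely, so it is only described here and not called.
print(rc_subprocess, rc_spawn)
```

None of these three paths goes through /bin/sh, as long as `subprocess` is not called with `shell=True`.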

I don't see why subprocess execution would matter for this issue to any extent beyond os.system. The way I see it, all VM-based runtimes imply that if you are able to provide the code (upload it or feed it as input) and execute the runtime, then you will get your code evaluated. Python makes this a bit easier compared to runtimes with a separate compiler and executor, but even there, if you can provide the bytecode and call the runtime, then you will get the same result.

loosebazooka commented 1 year ago

@evanj looks like we're going to try to make a python image without dash (while also keeping an experimental python image with dash). Curious if the ld.so.cache is still required for python frameworks to continue to work?

evanj commented 1 year ago

Good question. I haven't followed python3 closely in a while, so this could have changed. There are two things that removing dash will break:

  1. Anything that calls os.system(), or subprocess.Popen() with shell=True. In both cases, Python executes the command string with the shell (/bin/sh), which will fail without dash.

  2. The ld.so stuff was necessary for ctypes.util.find_library('c'), which is used to call "raw" C library functions. The Python implementation of this function still runs "ldconfig -p", which reads this cache. I haven't tested it recently. This code does have a fallback to running "ld -t", so maybe installing "ld" in the image from the Debian binutils package would be a way to make this work with less effort. It would require testing.
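Both items in the list above can be probed from a candidate image with a short script. This is a sketch rather than the project's actual test: it launches the current interpreter as the child process, and it only reports what `find_library` returns without assuming a particular result.

```python
import ctypes.util
import subprocess
import sys

# 1. Process launch without a shell: an argument list is executed directly,
#    whereas os.system or shell=True would route the string through /bin/sh.
direct = subprocess.run(
    [sys.executable, "-c", "print('no shell needed')"],
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

# 2. ctypes.util.find_library returns a library name such as "libc.so.6",
#    or None when none of its lookup strategies succeed.
libc_name = ctypes.util.find_library("c")

print(direct)
print("find_library('c') ->", libc_name)
```

A `None` result in an image without dash, ldconfig or ld would be a sign that libraries relying on this lookup are likely to fail there.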

Python implementation: https://github.com/python/cpython/blob/main/Lib/ctypes/util.py#L289

There are third-party libraries that use this. I personally ran into this with gunicorn, which appears to have removed its usage: https://github.com/benoitc/gunicorn/pull/2254 ; The original Distroless change with link to a bug with a number of people reporting this problem: https://github.com/GoogleContainerTools/distroless/pull/228

Additionally there was a "monotonic" package which was imported for some period of time by the Google Cloud APIs (transitively, via a "tenacity" third party API, then "monotonic"). However, that package is now deprecated since Python 3.3 included a monotonic time API.
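For reference, the standard library API that superseded that package is `time.monotonic()`. A minimal usage sketch, which needs no ctypes lookup of the C library:

```python
import time

# time.monotonic() (Python 3.3+) is a clock that never goes backwards,
# so it is suitable for measuring elapsed time.
start = time.monotonic()
time.sleep(0.01)
elapsed = time.monotonic() - start

print(f"elapsed: {elapsed:.4f}s")
```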

I did a quick github search and found a few places that do still use this. Two examples:

https://github.com/saltstack/salt/blob/fb717a8d4bea3d48c596ae8dda313b7639c23dc6/salt/auth/pam.py#L62
https://github.com/andnovar/kivy/blob/eccfb418b25cd2afb87ed8a130295658a2a1a0af/kivy/clock.py#L247

My conclusion: This will "break" some things. On the other hand, the things it breaks should be pretty rare, and keeping this working is painful. Since this package is marked experimental, it seems reasonable to me to try removing both of them.

Evan

-- Evan Jones https://www.evanjones.ca/