slimtoolkit / slim

Slim(toolkit): Don't change anything in your container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)
Apache License 2.0

Slimming Python file results in a ModuleNotFoundError or ImportError #436

Closed OscarIntellico closed 1 year ago

OscarIntellico commented 1 year ago

Description

I'm trying to slim a Docker container with some Python dependencies (in particular pandas, numpy, and scikit-learn). The original container runs fine. After running docker-slim build I get a much smaller container, but I think the slimming process has been too aggressive. The error is also inconsistent across builds.

Source Code

Dockerfile

`FROM python:3.9-slim

WORKDIR /app
COPY . /app
RUN python -m pip install numpy scikit-learn pandas

CMD ["python3", "problem.py"]`

problem.py

import numpy as np
import pandas as pd
df = pd.DataFrame()
print(df)

Expected Behavior

docker-slim should create a slimmed version of the original container. The output of docker run problem.slim should be:

Empty DataFrame
Columns: []
Index: []


Actual Behavior

Different problems arose on different runs.

RUN 1:
Traceback (most recent call last):
  File "/app/aaaa.py", line 2, in <module>
    import pandas as pd;
  File "/usr/local/lib/python3.9/site-packages/pandas/__init__.py", line 138, in <module>
    from pandas import testing  # noqa:PDF015
  File "/usr/local/lib/python3.9/site-packages/pandas/testing.py", line 6, in <module>
    from pandas._testing import (
  File "/usr/local/lib/python3.9/site-packages/pandas/_testing/__init__.py", line 69, in <module>
    from pandas._testing.asserters import (
  File "/usr/local/lib/python3.9/site-packages/pandas/_testing/asserters.py", line 17, in <module>
    import pandas._libs.testing as _testing
ModuleNotFoundError: No module named 'pandas._libs.testing'

RUN 2: The output was correct

RUN 3:
Traceback (most recent call last):
  File "/app/aaaa.py", line 2, in <module>
    import pandas as pd;
  File "/usr/local/lib/python3.9/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import is_numpy_dev as _is_numpy_dev  # pyright: ignore # noqa:F401
  File "/usr/local/lib/python3.9/site-packages/pandas/compat/__init__.py", line 18, in <module>
    from pandas.compat.numpy import (
  File "/usr/local/lib/python3.9/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module>
    from pandas.util.version import Version
  File "/usr/local/lib/python3.9/site-packages/pandas/util/__init__.py", line 2, in <module>
    from pandas.util._decorators import (  # noqa:F401
ModuleNotFoundError: No module named 'pandas.util._decorators'
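
Since each failing run names a different missing module, it can help to list every submodule that fails to import instead of stopping at the first traceback. The following is a minimal sketch of such a check, not part of docker-slim. The function name `find_broken_imports` is hypothetical, and the approach assumes the standard-library modules `importlib` and `pkgutil` are themselves still present in the image being checked.

```python
import importlib
import pkgutil


def find_broken_imports(package_name):
    """Import a package and each of its submodules; return {module: error}.

    Hypothetical helper for illustration only.
    """
    broken = {}
    try:
        package = importlib.import_module(package_name)
    except ImportError as exc:
        # The top-level import already fails, so submodules cannot be walked.
        broken[package_name] = f"{type(exc).__name__}: {exc}"
        return broken

    # Plain modules (for example math) have no __path__ and no submodules.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return broken

    def record(name):
        # walk_packages calls this when it cannot import a subpackage.
        broken.setdefault(name, "failed while scanning subpackages")

    for info in pkgutil.walk_packages(search_path, package_name + ".", onerror=record):
        try:
            importlib.import_module(info.name)
        except Exception as exc:
            broken[info.name] = f"{type(exc).__name__}: {exc}"
    return broken


if __name__ == "__main__":
    for name, error in sorted(find_broken_imports("pandas").items()):
        print(name, "->", error)
```

Importing every submodule also touches test modules and optional dependencies, so some failures are likely even in the original image. The useful signal is the difference between the output in the original image and in the slimmed one.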


Steps to Reproduce the Problem

  1. docker build . -t problem
  2. docker-slim build --http-probe=false problem
  3. docker run problem.slim
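
Because the failure differs from build to build, repeating steps 2 and 3 several times makes the inconsistency easier to see. The sketch below only wraps the three commands listed above. The names `reproduce` and `run_command` are hypothetical, and it assumes `docker` and `docker-slim` are on the PATH and that the script is run from the directory containing the Dockerfile.

```python
import subprocess

# Commands copied verbatim from the reproduction steps.
BUILD = ["docker", "build", ".", "-t", "problem"]
SLIM = ["docker-slim", "build", "--http-probe=false", "problem"]
RUN = ["docker", "run", "problem.slim"]


def run_command(command):
    """Run a command and return (exit_code, stdout + stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


def reproduce(attempts, runner=run_command):
    """Build once, then slim and run the image `attempts` times.

    Returns one short outcome string per attempt.
    """
    code, output = runner(BUILD)
    if code != 0:
        raise RuntimeError("docker build failed:\n" + output)

    outcomes = []
    for _ in range(attempts):
        slim_code, _slim_output = runner(SLIM)
        if slim_code != 0:
            outcomes.append("slim failed")
            continue
        run_code, run_output = runner(RUN)
        if run_code == 0 and "Empty DataFrame" in run_output:
            outcomes.append("ok")
        else:
            # The last line of a traceback names the exception.
            lines = run_output.strip().splitlines()
            outcomes.append(lines[-1] if lines else "no output")
    return outcomes


if __name__ == "__main__":
    for number, outcome in enumerate(reproduce(3), start=1):
        print(f"RUN {number}: {outcome}")
```

The `runner` parameter exists so the loop can be exercised without Docker by passing a stand-in function.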

Specifications

OscarIntellico commented 1 year ago

I was able to solve it myself, I had to write ENTRYPOINT ["python3", "problem.py"] instead of the CMD line.

kcq commented 1 year ago

> I was able to solve it myself, I had to write ENTRYPOINT ["python3", "problem.py"] instead of the CMD line.

@OscarIntellico it should work with the CMD instruction too. It's definitely a bug, and I'll investigate.

kcq commented 1 year ago

The latest version should work with your original Dockerfile (it would be good to retest once the newest version is out).