GoogleContainerTools / distroless

🥑 Language focused docker images, minus the operating system.
Apache License 2.0

mysqlclient on distroless python image does not work #1295

Closed justmike1 closed 1 year ago

justmike1 commented 1 year ago

Describe the bug

ImportError: cannot import name '_mysql' from partially initialized module 'MySQLdb' (most likely due to a circular import) (/usr/local/lib/python3.8/site-packages/MySQLdb/__init__.py)

To Reproduce

An app that uses SQLAlchemy to connect to a MySQL database with self.engine_str = f'mysql://{self.username}:{self.password}@{self.host}:{self.port}/{self.database}', built with the following Dockerfile:

FROM python:3.8-slim-buster AS build

ENV PB_REL="https://github.com/protocolbuffers/protobuf/releases"
ENV VERSION="21.12"
ENV PYTHONUNBUFFERED=true
WORKDIR /protobuf
COPY server/proto .

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    unzip ca-certificates build-essential default-libmysqlclient-dev curl zip && \
    rm -rf /var/lib/apt/lists/*

RUN curl -LO $PB_REL/download/v${VERSION}/protoc-${VERSION}-linux-x86_64.zip && \
    unzip -o protoc-${VERSION}-linux-x86_64.zip -d /usr/local bin/protoc && \
    unzip -o protoc-${VERSION}-linux-x86_64.zip -d /usr/local include/* && \
    rm -rf protoc-${VERSION}-linux-x86_64.zip

RUN mkdir output/ && \
    for p in $(find . -type f -name "*.proto"); do \
      protoc --python_out=output/ --pyi_out=output/ ${p} ; \
    done

WORKDIR /app
COPY server server
RUN mv /protobuf/output/** server/.
COPY Pipfile* ./

RUN pip install pipenv && \
  pipenv install --verbose --system --deploy --ignore-pipfile

# Final stage
FROM gcr.io/distroless/python3-debian11
COPY --from=build /usr/local/lib/python3.8/site-packages /usr/local/lib/python3.8/site-packages
COPY --from=build /app /app
WORKDIR /app
ENV DYLD_LIBRARY_PATH="/usr/local/mysql/lib:$PATH"
ENV PYTHONPATH=/usr/local/lib/python3.8/site-packages

CMD ["server/app.py"]

Expected behavior

The app runs without mysql import error

Console Output

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 18, in <module>
/app/server
    from . import _mysql
ImportError: cannot import name '_mysql' from partially initialized module 'MySQLdb' (most likely due to a circular import) (/usr/local/lib/python3.8/site-packages/MySQLdb/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/server/app.py", line 172, in <module>
    config = app_init()
  File "/app/server/app.py", line 35, in app_init
    db_proxy, config = utils.db_proxy_init(config)
  File "/app/server/prop_utils.py", line 199, in db_proxy_init
    db_proxy.start()
  File "/app/server/prop_db_proxy.py", line 81, in start
    self.engine = create_engine(self.engine_str, pool_pre_ping=True, echo=True)
  File "<string>", line 2, in create_engine
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 309, in warned
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 560, in create_engine
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 163, in dbapi
    return __import__("MySQLdb")
  File "/usr/local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 24, in <module>
    version_info, _mysql.version_info, _mysql.__file__
NameError: name '_mysql' is not defined

Additional context

I tried using these images:

FROM gcr.io/distroless/python3-debian11
FROM gcr.io/distroless/python3-debian10
FROM gcr.io/distroless/python3

I also tried using the deb-extractor hack:

# Extract mysqlclient dependencies
FROM python:3.8-slim-buster AS deb_extractor
RUN cd /tmp && \
    apt-get update && apt-get download \
        libmysqlclient-dev && \
    mkdir /dpkg && \
    for deb in *.deb; do dpkg --extract $deb /dpkg || exit 10; done

loosebazooka commented 1 year ago

Sorry, this sort of question is beyond our expertise. If you can tell us what is missing from the experimental python image, perhaps we can help.
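
One way to narrow down what is missing is to check whether the compiled _mysql extension copied from the build stage has a filename suffix that the runtime interpreter will load at all. This is only a sketch, and it rests on an assumption nobody in this thread confirms: that the build stage (Python 3.8) and the distroless runtime use different Python minor versions, so a file such as _mysql.cpython-38-x86_64-linux-gnu.so is never considered. The helper name is_loadable, the example filenames, and the py39_suffixes list are illustrative, not taken from the report.

```python
import importlib.machinery


def is_loadable(filename, suffixes=None):
    """Return True if `filename` ends in an extension-module suffix accepted
    by `suffixes` (default: the suffixes of the running interpreter)."""
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    # Split "_mysql.cpython-38-x86_64-linux-gnu.so" into the module name
    # and the full suffix ".cpython-38-x86_64-linux-gnu.so".
    _, dot, rest = filename.partition(".")
    return bool(dot) and (dot + rest) in suffixes


# Suffixes a CPython 3.9 interpreter on x86_64 Linux would typically accept.
# Assumed for illustration; print EXTENSION_SUFFIXES in the real image instead.
py39_suffixes = [".cpython-39-x86_64-linux-gnu.so", ".abi3.so", ".so"]

print(is_loadable("_mysql.cpython-38-x86_64-linux-gnu.so", py39_suffixes))  # False
print(is_loadable("_mysql.cpython-39-x86_64-linux-gnu.so", py39_suffixes))  # True
```

Even when the suffix matches, a C extension like this still needs the MySQL/MariaDB client shared library present in the final image, so a matching suffix alone does not guarantee the import succeeds.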

justmike1 commented 1 year ago

@loosebazooka From what I have researched, what is missing is the library provided by default-libmysqlclient-dev.

justmike1 commented 1 year ago

Also, the ENV line:

ENV DYLD_LIBRARY_PATH="/usr/local/mysql/lib:$PATH"

was a test I ran that didn't help. It is unrelated to the source Dockerfile and has been removed.

justmike1 commented 1 year ago

So changing the dialect driver from mysql to pymysql did the trick. I wouldn't say this issue is resolved, because the mysql driver is typically the better choice as it's implemented in C, but a compiler needs to be included in the image.
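
For readers who want to make the same switch, the change amounts to naming the driver in the SQLAlchemy URL: a plain mysql:// URL selects the MySQLdb (mysqlclient) driver, while mysql+pymysql:// selects the pure-Python PyMySQL driver. Below is a minimal sketch of building such a URL. The function name build_engine_str and the sample credentials are hypothetical, and the percent-encoding of the username and password is an addition that the original f-string does not have.

```python
from urllib.parse import quote_plus


def build_engine_str(username, password, host, port, database, driver="pymysql"):
    """Build a SQLAlchemy URL of the form mysql+<driver>://user:pass@host:port/db."""
    # quote_plus keeps characters such as "@" or ":" in the credentials
    # from being misread as URL delimiters.
    return (
        f"mysql+{driver}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


engine_str = build_engine_str("app", "p@ss", "db.internal", 3306, "appdb")
print(engine_str)  # mysql+pymysql://app:p%40ss@db.internal:3306/appdb

# The result is then passed on unchanged, as in the traceback above:
#   create_engine(engine_str, pool_pre_ping=True, echo=True)
```

PyMySQL has to be installed in the build stage (for example through the Pipfile) for this URL to work. Since it is pure Python, copying site-packages into the final image should be enough, with no extra shared libraries.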

dlorenc commented 1 year ago

Not sure if you want to give a different image a try, but we might be able to help get this working with the images at cgr.dev/chainguard/python - it's a bit easier to get extra C dependencies configured in them.

dlorenc commented 1 year ago

FWIW this works:

$ docker run -u root -it --entrypoint=sh cgr.dev/chainguard/python:latest-dev
$ apk add mariadb-dev cmd:mariadb_config 
$ pip install mysqlclient
# python
Python 3.11.3 (main, Jan  1 1970, 00:00:00) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import MySQLdb
>>> print(MySQLdb.version_info)
(2, 1, 1, 'final', 0)
>>>

justmike1 commented 1 year ago

What is the difference between distroless and Chainguard? I usually go with Google's distroless, but I am not familiar with Chainguard.

dlorenc commented 1 year ago

The Chainguard images are still OSS and "distroless", just based on a new distro called Wolfi designed for this use case instead of debian. A lot of us were involved in creating this project here originally, and we tried to fix a lot of the issues with these new versions. The source for packages is here: https://github.com/wolfi-dev/os

Some more background: https://www.chainguard.dev/unchained/celebrating-6-years-of-distroless

loosebazooka commented 1 year ago

They're both distroless-style images that serve very similar purposes, so you're probably okay using either. They're built slightly differently: distroless is based on Debian, while Chainguard images are based on Wolfi (wolfi.dev).

I think you'll find that while distroless is scoped to a pretty limited number of images, Wolfi/Chainguard has a wide catalogue of images, so you might find the Wolfi team more flexible about creating or changing images to match your specific needs.

justmike1 commented 1 year ago

I see now, thank you guys! I set this one as closed. TIL

justmike1 commented 1 year ago

@loosebazooka @dlorenc

#1296

Probably a different issue with the same cause. I think having the C essentials is mandatory for a Python package.
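
A defensive startup check along these lines can make this failure mode explicit: try the C-backed driver first and fall back to the pure-Python one when it cannot be loaded. This is a minimal sketch, not something proposed in the thread. pick_mysql_driver is a hypothetical helper and is not part of SQLAlchemy or either driver. It catches Exception rather than only ImportError because the traceback in the report ends in a NameError raised while importing MySQLdb.

```python
import importlib


def pick_mysql_driver(candidates=("MySQLdb", "pymysql")):
    """Return the name of the first candidate module that imports cleanly,
    or None if none of them can be imported."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # A broken C extension may surface as ImportError, NameError, or
            # OSError, so any exception here counts as "driver unusable".
            print(f"{name} unavailable: {exc!r}")
            continue
        return name
    return None


driver = pick_mysql_driver()
print("selected driver:", driver)
```

The returned module name can then be used to choose between a mysql+mysqldb:// and a mysql+pymysql:// engine URL.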