lambci / docker-lambda

Docker images and test runners that replicate the live AWS Lambda environment
MIT License

Missing dependency chardet when using pipenv install on latest build-python3.7 image tag #271

Closed samshelley closed 4 years ago

samshelley commented 4 years ago

Hi @mhart, thanks so much for maintaining these dockerfiles!

We started running into an issue yesterday, at roughly the time the latest tags for each build environment were published. The chardet dependency (https://pypi.org/project/chardet/), which many Python libraries rely on, suddenly started vanishing from our build step. All of the other dependencies are present in site-packages, just not that one. Reverting to an older digest (i.e. FROM lambci/lambda@sha256:4d7db8d1724a0554574c506c8e5bcaa810718c91ccd9394d3e2456365108d56d) fixes the issue.

Here's a minimal repro.

Dockerfile:

FROM lambci/lambda:build-python3.7

COPY Pipfile Pipfile.lock ./

RUN pipenv install --deploy

RUN cd $(pipenv --venv)/lib/python3.7/site-packages \
    && ls \
    && zip -qr9 /var/task/function.zip .

CMD echo done

Pipfile:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
chardet = "*"

[requires]
python_version = "3.7"

Pipfile.lock:

{
    "_meta": {
        "hash": {
            "sha256": "2a6cc4b02dd471cb1222b585553df91a14b96b0d50a569e011e1733cb1115fed"
        },
        "pipfile-spec": 6,
        "requires": {
            "python_version": "3.7"
        },
        "sources": [
            {
                "name": "pypi",
                "url": "https://pypi.org/simple",
                "verify_ssl": true
            }
        ]
    },
    "default": {
        "chardet": {
            "hashes": [
                "sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae",
                "sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691"
            ],
            "index": "pypi",
            "version": "==3.0.4"
        }
    },
    "develop": {}
}

Then build it (docker build . < Dockerfile) and you will see that the ls step does not list chardet among the site-packages.
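The same check can be scripted instead of read off the ls output. This is a minimal sketch, assuming the archive is written to /var/task/function.zip as in the Dockerfile above; the helper names `top_level_entries` and `has_package` are hypothetical and not part of the repro.

```python
import zipfile


def top_level_entries(zip_path):
    """Return the set of top-level names (files or directories) in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # "chardet/__init__.py" and "chardet/" both reduce to "chardet".
        return {name.split("/", 1)[0] for name in archive.namelist()}


def has_package(zip_path, package):
    """True if `package` is a top-level directory or single-file module in the archive."""
    entries = top_level_entries(zip_path)
    return package in entries or f"{package}.py" in entries
```

For a good build, `has_package("/var/task/function.zip", "chardet")` would be expected to return True.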

If you swap out the FROM to be FROM lambci/lambda@sha256:4d7db8d1724a0554574c506c8e5bcaa810718c91ccd9394d3e2456365108d56d (a version one of us had from a few months ago) it fixes the problem.

We're still investigating to see if something else might be going on on our side, but any ideas on what might have changed recently would be immensely helpful.

mhart commented 4 years ago

Hi @samshelley – I'm sorry that a change like this broke your build process. I'm not sure entirely where it would have come from though.

I don't explicitly install chardet, so the only thing I can think of is that an update to another Python dep has dropped chardet as a dependency?

For example, one of these dependencies like aws-sam-cli may have had a chardet dependency, but no longer does? https://github.com/lambci/docker-lambda/blob/master/python3.8/build/Dockerfile#L13-L16

In general I guess it's hard to avoid these problems if deps drop sub-deps over time. Because these build images don't guarantee what libraries they have on them besides the ones explicitly installed, the best thing to do would be to install these deps yourself.

mhart commented 4 years ago

(sorry, that's a link to the python3.8 image, here's the python3.7 build image: https://github.com/lambci/docker-lambda/blob/master/python3.7/build/Dockerfile#L13-L16)

samshelley commented 4 years ago

Hi @mhart no problem at all! Thanks for the insanely fast response.

So specifically, chardet is OUR dependency, listed in our Pipfile. We definitely aren't relying on you keeping it as a dependency of something else. What's so incredibly odd is that, despite being explicitly listed in our Pipfile, installing it does not leave the package present on newer versions of your Docker image -- it just vanishes.

DockerHub makes it super hard to do a bisection and figure out which set of changes might have caused it -- do you know when your last release was before the one from ~yesterday? (or maybe you know how to find that info on DockerHub). We were trying to give you the exact changeset that caused it, but haven't managed to figure that out yet.

samshelley commented 4 years ago

Our first thought was maybe it was something to do with a new version of pipenv FWIW. Is versioning up aws-sam-cli the only thing you've done recently?

mhart commented 4 years ago

Hmmm. I'm starting to wonder if something's up with wheel – it's no longer being installed. I'm not entirely sure why, but I'm going to add it in explicitly – just about to push up new images now.

mhart commented 4 years ago

Please try again now and let me know how you go

mhart commented 4 years ago

Sorry, I mean... now

samshelley commented 4 years ago

Sure -- using my minimal repro, I'm still not seeing chardet in site-packages.

Here's my output:

Sending build context to Docker daemon  6.656kB
Step 1/5 : FROM lambci/lambda:build-python3.7
 ---> 70e62da7e44a
Step 2/5 : COPY Pipfile Pipfile.lock ./
 ---> 138578c09453
Step 3/5 : RUN pipenv install --deploy
 ---> Running in b03c2f1a0e50
Creating a virtualenv for this project…
Pipfile: /var/task/Pipfile
Using /var/lang/bin/python3.7 (3.7.7) to create virtualenv…
⠇ Creating virtual environment...created virtual environment CPython3.7.7.final.0-64 in 539ms
  creator CPython3Posix(dest=/root/.local/share/virtualenvs/task-rlWbeMzF, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, via=copy, app_data_dir=/root/.local/share/virtualenv/seed-app-data/v1.0.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

✔ Successfully created virtual environment! 
Virtualenv location: /root/.local/share/virtualenvs/task-rlWbeMzF
Installing dependencies from Pipfile.lock (115fed)…
Removing intermediate container b03c2f1a0e50
 ---> cae751a0759c
Step 4/5 : RUN cd $(pipenv --venv)/lib/python3.7/site-packages     && ls     && zip -qr9 /var/task/function.zip .
 ---> Running in f6861e7bac73
easy_install.py
pip
pip-20.1.dist-info
pip-20.1.virtualenv
pkg_resources
setuptools
setuptools-46.1.3.dist-info
setuptools-46.1.3.virtualenv
_virtualenv.pth
_virtualenv.py
wheel
wheel-0.34.2.dist-info
wheel-0.34.2.virtualenv
Removing intermediate container f6861e7bac73
 ---> 69a89cee07ba
Step 5/5 : CMD echo done
 ---> Running in 38ba7b2848f2
Removing intermediate container 38ba7b2848f2
 ---> e1e49bf0b2c2
Successfully built e1e49bf0b2c2

samshelley commented 4 years ago

Here it is working using the explicit sha256 tag:

Sending build context to Docker daemon  6.656kB
Step 1/5 : FROM lambci/lambda@sha256:4d7db8d1724a0554574c506c8e5bcaa810718c91ccd9394d3e2456365108d56d
 ---> 86d41bc2f471
Step 2/5 : COPY Pipfile Pipfile.lock ./
 ---> 5ef16c645bf5
Step 3/5 : RUN pipenv install --deploy
 ---> Running in cd89c5f86d89
Creating a virtualenv for this project…
Pipfile: /var/task/Pipfile
Using /var/lang/bin/python3.7 (3.7.5) to create virtualenv…
⠋ Creating virtual environment...Already using interpreter /var/lang/bin/python3.7
Using base prefix '/var/lang'
New python executable in /root/.local/share/virtualenvs/task-rlWbeMzF/bin/python3.7
Also creating executable in /root/.local/share/virtualenvs/task-rlWbeMzF/bin/python
Installing setuptools, pip, wheel...
done.

✔ Successfully created virtual environment! 
Virtualenv location: /root/.local/share/virtualenvs/task-rlWbeMzF
Installing dependencies from Pipfile.lock (115fed)…
Removing intermediate container cd89c5f86d89
 ---> ec2a2ad39df1
Step 4/5 : RUN cd $(pipenv --venv)/lib/python3.7/site-packages     && ls     && zip -qr9 /var/task/function.zip .
 ---> Running in 36096cc42741
chardet
chardet-3.0.4.dist-info
easy_install.py
pip
pip-20.1.dist-info
pkg_resources
__pycache__
setuptools
setuptools-46.1.3.dist-info
wheel
wheel-0.34.2.dist-info
Removing intermediate container 36096cc42741
 ---> bb18d75672c5
Step 5/5 : CMD echo done
 ---> Running in d669da494882
Removing intermediate container d669da494882
 ---> a9b5f6a6feb8
Successfully built a9b5f6a6feb8

mhart commented 4 years ago

Have you pulled the latest image? I see it if I do pip list (I mean, I see wheel – which wasn't installed before)

mhart commented 4 years ago

I also see chardet fwiw

samshelley commented 4 years ago

Hi @mhart -- yes I pulled the latest and you are absolutely correct that running pip list inside the container (RUN $(pipenv --venv)/bin/pip list) says it's installed:

Package    Version
---------- ----------
certifi    2019.11.28
chardet    3.0.4
docutils   0.15.2
idna       2.9
jmespath   0.9.5
pip        20.1
s3transfer 0.3.3
setuptools 46.1.3
six        1.14.0
urllib3    1.25.8
wheel      0.34.2

What's insanely weird is that it is nowhere to be found in the actual site-packages though.
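The pip list output above also reports packages such as jmespath and s3transfer that the earlier ls output does not show, which hints that pip may be seeing a directory outside the virtualenv. One way to check where a name would actually be imported from is sketched below. It is an illustration only: the helper names `module_location` and `is_inside` are hypothetical, and it would need to be run with the virtualenv's interpreter.

```python
import importlib.util
import os


def module_location(name):
    """Return the directory a top-level module or package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.origin or not os.path.isabs(spec.origin):
        return None  # not found, or built-in/frozen/namespace with no single file on disk
    path = os.path.dirname(spec.origin)
    if spec.submodule_search_locations:
        # A package: origin is <dir>/<name>/__init__.py, so step up one more level.
        path = os.path.dirname(path)
    return path


def is_inside(name, site_packages):
    """True if `name` resolves to exactly the given site-packages directory."""
    location = module_location(name)
    return location is not None and os.path.realpath(location) == os.path.realpath(site_packages)
```

If `module_location("chardet")` returns a directory other than the virtualenv's site-packages, the package is being picked up from somewhere else on sys.path.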

That's why my next thought is that it's a pipenv bug. If you scroll to the bottom of https://github.com/pypa/pipenv/issues/3801 you can see that generating a requirements.txt and installing from it seems to fix a somewhat related problem. FWIW we actually tried doing this for our build and it fixed it.
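To make the requirements.txt workaround concrete, here is a minimal sketch of how the "default" section of a Pipfile.lock maps onto requirements lines with pinned versions and hashes. pipenv can export this file itself; the function name `lock_to_requirements` is hypothetical, and the sketch deliberately ignores markers, extras, and non-PyPI sources.

```python
import json


def lock_to_requirements(lock_text, section="default"):
    """Convert one section of a Pipfile.lock (JSON text) into requirements.txt lines."""
    lock = json.loads(lock_text)
    lines = []
    for name, entry in sorted(lock.get(section, {}).items()):
        version = entry.get("version", "")  # already includes the operator, e.g. "==3.0.4"
        # pip's hash-checking mode takes one --hash option per allowed digest.
        hashes = "".join(f" --hash={digest}" for digest in entry.get("hashes", []))
        lines.append(f"{name}{version}{hashes}")
    return lines
```

Applied to the Pipfile.lock from the repro, this would yield a single line starting with chardet==3.0.4 followed by its two --hash options, which pip can then install directly.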

I wonder if it's something to do with Python 3.7.7 and pipenv (that's the only thing I can think of). Did you update to Python 3.7.7 recently?

mhart commented 4 years ago

I don't update the live Lambda runtimes – AWS does – so these images would have picked up Python 3.7.7 sometime after AWS changed it in Lambda. There's no explicit commit for this, because no code actually changes in this repo.

samshelley commented 4 years ago

Okay that makes complete sense -- sounds like it might be that. Any chance you have a history of the digests you've published by tag so I can try to diagnose further?

Otherwise I do have two potentially working solutions at this point (pin the sha256 or switch to using requirements.txt)

pipenv also looks to be getting an update soon, so that might fix the issue too

Either way we can close this out since it's almost certainly upstream. Thanks!

mhart commented 4 years ago

I don't have any digest history unfortunately – I only do explicit tags if there are (known) breaking changes. Will close this out, but let us know here if you solve your issues

mhart commented 4 years ago

I believe this should now be fixed with https://github.com/lambci/docker-lambda/commit/e69aa8b1f5ed85938e2be61db0ed9f574e17dd58 – I'm just unsetting PYTHONPATH. See https://github.com/lambci/docker-lambda/issues/272#issuecomment-628966591 if you want to revert to the older images that had this env var set.
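The thread does not spell out why PYTHONPATH mattered. One plausible reading, offered here as an assumption, is that a directory on PYTHONPATH is visible to every interpreter started in that environment, including the one inside a virtualenv, so a package found there can look as if it were already installed. The sketch below shows only the general mechanism: entries from PYTHONPATH appear on a child interpreter's sys.path and disappear once the variable is removed. The helper names are hypothetical.

```python
import json
import os
import subprocess
import sys


def child_sys_path(env):
    """Return sys.path as seen by a fresh interpreter started with environment `env`."""
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return json.loads(result.stdout)


def without_pythonpath(env=None):
    """Return a copy of `env` (default: the current environment) with PYTHONPATH removed."""
    cleaned = dict(os.environ if env is None else env)
    cleaned.pop("PYTHONPATH", None)
    return cleaned
```

Comparing `child_sys_path(dict(os.environ))` with `child_sys_path(without_pythonpath())` inside the build image would show which directories the variable contributes.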

samshelley commented 4 years ago

Got it. I read through the other issue, and it seems that one affects us as well.

We are still pinned to the last working sha for now but I’ll switch it to the latest next week and confirm the fix.

Thanks!