aws / aws-lambda-base-images

Python runtime dependencies override user-provided libraries #8

Closed mpszumowski closed 12 months ago

mpszumowski commented 3 years ago

Image used: amazon/aws-lambda-python:3.7

Dockerfile:

FROM amazon/aws-lambda-python:3.7

RUN yum -y install gcc

COPY ./requirements.txt ./requirements.txt
RUN pip3 install -r requirements.txt
COPY ./lambda_function.py ./lambda_function.py

RUN python -c 'import sys; print(sys.path)'
RUN pip3 freeze | grep idna

CMD [ "lambda_function.lambda_handler" ]

requirements.txt

snowflake-connector-python==2.3.10
snowflake-sqlalchemy==1.2.4

lambda_function.py

import pkg_resources
import sys

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

def lambda_handler(event, context):

    print(sys.path)
    print(pkg_resources.working_set.by_key['idna'])
    engine = create_engine(
        URL(account='account', user='user',
            database='"DATABASE"', warehouse='warehouse')
    )
    return {'success': 200}

What I would expect: Dependencies are installed correctly, the lambda_handler imports them and executes properly on Lambda.

What is the case: Log from Lambda:

[ERROR] ContextualVersionConflict: (idna 3.1 (/var/runtime), Requirement.parse('idna<3,>=2.5'), {'requests', 'snowflake-connector-python'})
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 17, in lambda_handler
    database='"DATABASE"', warehouse='warehouse')
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 520, in create_engine
    return strategy.create(*args, **kwargs)
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 61, in create
    entrypoint = u._get_entrypoint()
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/engine/url.py", line 172, in _get_entrypoint
    cls = registry.load(name)
  File "/var/lang/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 275, in load
    return impl.load()
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2461, in load
    self.require(*args, **kwargs)
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2484, in require
    items = working_set.resolve(reqs, env, installer, extras=self.extras)
  File "/var/lang/lib/python3.7/site-packages/pkg_resources/__init__.py", line 792, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)

At runtime, snowflake-connector-python picks up a different version of its dependency than the one pip installed during the Docker build. It then fails because idna 3.1 does not satisfy its requirement: idna<3,>=2.5.
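To make the failing check concrete, here is a minimal sketch of the comparison behind the error, assuming plain dotted numeric versions only. The helper names `parse_version` and `satisfies` are hypothetical, and this is not how pkg_resources implements the check (real specifier handling follows PEP 440 and covers many more cases).

```python
def parse_version(text):
    """Turn a plain dotted version such as '2.10' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, lower, upper):
    """Return True if lower <= version < upper (simplified, numeric versions only)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# The requirement from the traceback: idna<3,>=2.5
print(satisfies("2.10", lower="2.5", upper="3"))  # True: the version pip installed
print(satisfies("3.1", lower="2.5", upper="3"))   # False: the version in /var/runtime
```

Comparing tuples of ints rather than strings matters here: as strings, "2.10" would sort before "2.5".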

Why it tries to import a different version is suggested by the logging I have added in the Dockerfile and the lambda_function.

Dockerfile: RUN python -c 'import sys; print(sys.path)'

---> Running in 5c925a9e311a
['', '/var/lang/lib/python37.zip', '/var/lang/lib/python3.7', '/var/lang/lib/python3.7/lib-dynload', '/var/lang/lib/python3.7/site-packages']

lambda_function.lambda_handler: print(sys.path)

['/var/task', '/opt/python/lib/python3.7/site-packages', '/opt/python', '/var/runtime', '/var/lang/lib/python37.zip', '/var/lang/lib/python3.7', '/var/lang/lib/python3.7/lib-dynload', '/var/lang/lib/python3.7/site-packages', '/opt/python/lib/python3.7/site-packages', '/opt/python']

Dockerfile: RUN pip3 freeze | grep idna

idna==2.10

lambda_function.lambda_handler: print(pkg_resources.working_set.by_key['idna'])

idna 3.1

What happens is that the Lambda runtime puts the /var/runtime directory ahead of /var/lang/lib/python3.7 and populates the pkg_resources.WorkingSet with the distributions installed there (mostly boto3 and its dependencies). This is carried over to the Lambda handler, which is then executed with the "overridden" libraries. Judging by sys.path at the moment the handler executes, I presume it has been modified manually so that the user-provided code in /var/task comes first, rather than the runtime path. Why is /opt/python/lib/python3.7/site-packages second if it is not the pip site-packages directory?
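The effect of that ordering on distribution metadata can be reproduced outside Lambda. The following is a minimal sketch, not the runtime's actual code: it uses the standard-library `importlib.metadata` (Python 3.8+, so not available on the 3.7 image discussed here) instead of pkg_resources, and the helpers `first_version_on_path` and `make_fake_dist` are hypothetical names. Two temporary directories stand in for /var/runtime and the pip site-packages directory.

```python
import importlib.metadata  # standard library in Python 3.8+
import os
import tempfile


def first_version_on_path(name, search_path):
    """Return (version, directory) for the first entry of search_path that
    holds metadata for `name`, or None if no entry does."""
    for entry in search_path:
        for dist in importlib.metadata.distributions(path=[entry]):
            dist_name = dist.metadata["Name"]
            if dist_name and dist_name.lower() == name.lower():
                return dist.version, entry
    return None


def make_fake_dist(directory, name, version):
    """Write a minimal <name>-<version>.dist-info/METADATA file (demo only)."""
    info_dir = os.path.join(directory, f"{name}-{version}.dist-info")
    os.makedirs(info_dir)
    with open(os.path.join(info_dir, "METADATA"), "w") as handle:
        handle.write(f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n")


# Stand-ins for /var/runtime and /var/lang/lib/python3.7/site-packages.
runtime_dir = tempfile.mkdtemp(prefix="runtime-")
site_dir = tempfile.mkdtemp(prefix="site-packages-")
make_fake_dist(runtime_dir, "idna", "3.1")
make_fake_dist(site_dir, "idna", "2.10")

# Whichever directory comes first decides which version is reported.
print(first_version_on_path("idna", [runtime_dir, site_dir]))
print(first_version_on_path("idna", [site_dir, runtime_dir]))
```

With the runtime directory first, the sketch reports idna 3.1 even though 2.10 is installed further down the path, which matches the mismatch between `pip3 freeze` and the handler output.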

The outcome is really confusing - using Docker I expect to be able to handle my runtime and (at least) my dependencies. I definitely expect the Lambda Runtime to be transparent and its dependencies not to impact my workload. Especially if the bug is this opaque and undocumented.

I was able to work my way around with the following hack.

import pkg_resources
import sys

from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine

def lambda_handler(event, context):

    entry = '/var/lang/lib/python3.7/site-packages'
    sys.path = [entry] + sys.path
    for dist in pkg_resources.find_distributions(entry, True):
        pkg_resources.working_set.add(dist, entry, False, replace=True)

    [...]

It may be dangerous if the manually imported libraries in turn conflict with downstream code in the runtime. I think, however, that something of this kind could be implemented in the runtime itself so that the handler uses only the environment's libraries.

denis-ryzhkov commented 2 years ago

The same workaround, just moved to top-level, compatible with running locally, and creating no duplicate entries in sys.path:

import pkg_resources
import sys

site_packages = "/var/lang/lib/python3.8/site-packages"
try:
    sys.path.remove(site_packages)
except ValueError:
    pass
else:
    sys.path.insert(0, site_packages)
    for dist in pkg_resources.find_distributions(site_packages, True):
        pkg_resources.working_set.add(dist, site_packages, False, replace=True)

ilias-at-adarma commented 1 year ago

This behavior is terrible; I am surprised it hasn't been fixed. If you install your own boto3/botocore or any other library that is shadowed by /var/runtime, you will think you are running your pinned version, but no: you rely entirely on the copy in /var/runtime. A ticking time bomb in production. It is even worse if you haven't pinned a dated image tag and your image is cached: your dependencies slowly grow out of date without you knowing.

There is also not much control over this; manipulating sys.path is a rather hacky solution in my view.

Does anyone know if those existing boto3/botocore libraries shipped with the lambda are actually used by the runtime? Could we just wipe them from our Dockerfile?

SteggyLeggy commented 1 year ago

What happens is that Lambda Runtime sets the /var/runtime directory in front of /var/lang/lib/python3.7 and populates the pkg_resources.WorkingSet with the distributions installed there (mostly boto3 + deps).

Where does this happen? I mean what component is actually doing this?

Just thinking of making my own Docker base image to use for Lambdas, but I'd like to be sure that doing so will actually fix this issue.

ilias-at-adarma commented 1 year ago

Where does this happen? I mean what component is actually doing this?

The Lambda runtime script runs from /var/runtime, which is the same directory that contains boto3. From there it creates the Lambda listener and at some point imports the Lambda handler function, so that it can finally run your Lambda code when an event is received. Because the script lives in that directory, Python gives priority to imports from it. If Python doesn't find the module in the local directory, it then moves on to the paths specified in $PYTHONPATH.
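The "first matching directory wins" rule can be checked without importing anything. This is a minimal sketch using the standard library's path-based finder; `resolve_origin` and the module name `demo_shadowed` are hypothetical, and the two temporary directories merely stand in for /var/runtime and site-packages.

```python
import importlib.machinery
import os
import tempfile


def resolve_origin(module_name, search_path):
    """Return the file the path-based finder would pick for module_name when
    searching search_path in order, or None if nothing matches."""
    spec = importlib.machinery.PathFinder.find_spec(module_name, search_path)
    return spec.origin if spec else None


# Two stand-in directories that both provide a module with the same name.
runtime_dir = tempfile.mkdtemp(prefix="runtime-")
site_dir = tempfile.mkdtemp(prefix="site-packages-")
for directory in (runtime_dir, site_dir):
    with open(os.path.join(directory, "demo_shadowed.py"), "w") as handle:
        handle.write("WHERE = %r\n" % directory)

# The copy in the earlier directory shadows the later one.
print(resolve_origin("demo_shadowed", [runtime_dir, site_dir]))
print(resolve_origin("demo_shadowed", [site_dir, runtime_dir]))
```

Swapping the order of the two entries is enough to change which file is selected, which is the same mechanism the sys.path workarounds above rely on.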

SteggyLeggy commented 1 year ago

Oh I see, so this behaviour is more a by-product of how it is run, and not something done on purpose.

Looking at https://github.com/aws/aws-lambda-python-runtime-interface-client I cannot see any dependency documented on boto3 or botocore. That doesn't of course mean that it doesn't just rely on those libraries already being available though I guess.

Maybe making our own base images, that don't have boto etc installed alongside the lambda runtime interface client will help in this situation.

Looking at the documentation here, it doesn't suggest that boto3 etc are required dependencies either.

https://docs.aws.amazon.com/lambda/latest/dg/images-create.html#images-create-from-alt

aws-haddad commented 1 year ago

Thanks @SteggyLeggy, based on your suggestion I followed "Using an AWS base image for custom runtimes" from here: https://docs.aws.amazon.com/lambda/latest/dg/images-create.html#runtimes-images-custom which worked great.

jtuliani commented 12 months ago

We have published an updated image for Python 3.11 which addresses this issue.

Previously, the Lambda base container images for Python included the /var/runtime directory before the /var/lang/lib/python3.x directory in the search path. This meant that packages in /var/runtime were loaded in preference to packages pip installed into /var/lang/lib/python3.x. Since the AWS SDK for Python (boto3/botocore) was installed into /var/runtime, this made it harder for customers to upgrade the SDK version.

With the Python 3.11 runtime, the AWS SDK and its dependencies are now pre-installed into the /var/lang/lib/python3.11 directory, and the search path has been modified so this directory has precedence over /var/runtime. Customers can override the SDK by pip installing a newer version. This change also enables pip to verify and track that the pre-installed SDK and its dependencies are compatible with any customer-installed packages.
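One way to confirm which copy of the SDK a given image actually resolves is to log where each module would be loaded from. This is a minimal sketch using only the standard library; `module_origin` is a hypothetical helper name, and the result is `None` for any module that is not installed in the environment where it runs.

```python
import importlib.util


def module_origin(module_name):
    """Return the file a top-level module would be loaded from, or None if
    it cannot be found on the current search path."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec else None


# Log where the SDK (and a standard-library module for comparison) resolve from.
for name in ("boto3", "botocore", "json"):
    print(name, "->", module_origin(name))
```

Running this inside the handler of an image built from the base image shows whether the SDK comes from /var/runtime or from the pip-managed site-packages directory.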

ryancausey commented 10 months ago

Will this fix also be backported to the older lambda python base images that are still supported? I think it should be backported to all the image versions listed here: https://docs.aws.amazon.com/lambda/latest/dg/python-image.html#python-image-base

jtuliani commented 10 months ago

@ryancausey We don't currently plan to back-port this change to the Lambda images for earlier Python versions. It's a (potentially) breaking change and we don't want to break existing customer configurations.