lambci / docker-lambda

Docker images and test runners that replicate the live AWS Lambda environment
MIT License

New lambci:build-python3.7 image doesn't seem compatible with aws lambda us-west-2 #272

Closed bclodius closed 4 years ago

bclodius commented 4 years ago

Summary

When I deploy the same code zip package, built with the latest lambci:build-python3.7 image, I get the exception below in us-west-2 deployments. This does not happen in us-east-1 deployments.

I believe this may be a bug in the us-west-2 execution environment, or us-east-1 may be on a canary release. I am raising an AWS ticket for this.

[ERROR] Runtime.ImportModuleError: Unable to import module 'redacted': No module named 'chardet'

Additional Notes

  1. When I downloaded the code zips from the Lambda console for us-east-1 and us-west-2, I saw a significant size difference between them:
     us-east-1 size: 9684109 bytes
     us-west-2 size: 9333135 bytes
  2. If I take the downloaded west code zip and upload it to east, it works fine.
  3. If I take the downloaded east code zip and upload it to west, it also works fine.
  4. If I re-upload the west code zip to west, it reverts to the aforementioned error message.

Solution

For now I have reverted to an older copy of lambci:build-python3.7 that I pulled 4 months ago, and that resolves my issue completely.

I am wondering if this is related to some changes in https://github.com/lambci/docker-lambda/commit/4b7e20ab4d9bae5806428dda288bd6ad26db396d

mhart commented 4 years ago

Perhaps related to https://github.com/lambci/docker-lambda/issues/271 ?

mhart commented 4 years ago

Could it be a pipenv bug? https://github.com/pypa/pipenv/issues/3801

mark-at-nuna commented 4 years ago

I have this issue too. I can reproduce it with this lambda_function.py:

import chardet
import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': json.dumps('If you see this, the chardet import works.')
    }

Then I run docker run --rm -v "$PWD":/var/task:ro,delegated lambci/lambda:python3.7 lambda_function.lambda_handler and it works fine.

But if I paste the exact same code into a Lambda function in us-west-2, it fails with a Runtime.ImportModuleError.

As bclodius says, it also works on us-east-1. So I think the issue is that chardet is provided in us-east-1, and in this Docker image, but not in us-west-2.
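One way to confirm this per region without the function failing at import time is to ask Python where a module would be loaded from instead of importing it directly. The following is a minimal sketch, not something posted in this thread; the helper name `module_origin` is hypothetical, and it only uses the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would load from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def lambda_handler(event, context):
    # chardet is looked up lazily here, so a missing module shows up as None
    # in the response rather than as Runtime.ImportModuleError.
    return {
        'chardet': module_origin('chardet'),
        'json': module_origin('json'),
    }
```

If the reasoning above is right, the `chardet` entry should point somewhere under /var/runtime in a region that provides it and be `None` where it is absent.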

bclodius commented 4 years ago

@mark-at-nuna thanks for helping to confirm; I was going to try this next once I was back to my workstation. I'll be opening a high priority AWS ticket shortly and will share updates I get soon.

mhart commented 4 years ago

I'm starting to think PYTHONPATH might be the issue here – it's set explicitly in the build images to match what happens in production – but it seems that certain tools like virtualenv and pipenv have issues with it when it's set.

mhart commented 4 years ago

Can you try lambci/lambda:build-python3.7-test and see if that fixes your issue?

I know it might be a breaking change for some, but if that works I might just remove PYTHONPATH entirely from the build Dockerfile.

mhart commented 4 years ago

Basically, it seems that tools like pipenv, if they see that modules like chardet exist in /var/runtime (PYTHONPATH), then they don't install them – and I can't see an easy way to force them to do so?

bclodius commented 4 years ago

@mhart @mark-at-nuna I think there was a rollback or something on AWS side. Some functions of mine that had been failing for a day straight with the import error are now magically working without any redeployment on my end. Still on the ticket.

mhart commented 4 years ago

I don't think it's a rollback per se – I think probably what's happening is that chardet has been added to /var/runtime in us-east-1, which the build image is based on, and therefore pipenv/virtualenv/whatever is deciding not to install it and so it doesn't get bundled up in your zip. This is fine in us-east-1, but not elsewhere where chardet doesn't exist.

Ideally there'd be a way to tell pipenv/virtualenv to always install packages even if they exist in PYTHONPATH, and that way it wouldn't matter if AWS adds/removes system libraries. I see it as a bug in pipenv/virtualenv to be honest, but can't see a good way around it.

So I think the answer has to be to not set PYTHONPATH in all of the build images, and ppl just end up installing (possibly) more packages than they may otherwise need to – but at least they'll be safe from changes on AWS' end
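Whichever build image is used, a quick check on the finished zip can catch a silently skipped dependency before deployment. Below is a rough sketch under stated assumptions: `missing_from_bundle` is a hypothetical helper, it only looks at top-level `.py` files and package directories in the archive, and it does not recognise compiled extension modules (`.so` files).

```python
import zipfile


def missing_from_bundle(zip_path, required):
    """Return the names in `required` that have no top-level entry in the zip."""
    with zipfile.ZipFile(zip_path) as bundle:
        top_level = set()
        for name in bundle.namelist():
            first = name.split('/', 1)[0]
            # 'chardet/__init__.py' -> 'chardet'; 'six.py' -> 'six'
            top_level.add(first[:-3] if first.endswith('.py') else first)
    return sorted(set(required) - top_level)
```

For example, `missing_from_bundle('function.zip', ['chardet'])` (the file name is illustrative) would return `['chardet']` for a bundle built while the package was being picked up from PYTHONPATH instead of installed.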

mhart commented 4 years ago

Ah sorry, just read your msg again – if things were failing and then magically working, then I think you're right, they probably have been rolling back (or perhaps rolling out more widely beyond us-east-1)

mhart commented 4 years ago

Have pushed up new python build images that remove PYTHONPATH. The older images (with PYTHONPATH set) can be found at:

I'm hoping this won't break anyone's workflow (it should do the opposite). It may mean your bundles are slightly larger, but I think this should be a good thing™

bclodius commented 4 years ago

@mhart Actually, I still see it failing in some of my AWS accounts using @mark-at-nuna's minimal example.

I quickly ran a help("modules") to capture installed system modules. Below are the results which show exactly why this is happening.

east_packages = set(['struct','_dbm','chardet','macpath','subprocess','_decimal','chunk','mailbox','sunau','_dummy_thread','cmath','mailcap','symbol','_elementtree','cmd','marshal','symtable','_functools','code','math','sys','_gdbm','codecs','mimetypes','sysconfig','_hashlib','codeop','mmap','syslog','_heapq','collections','modulefinder','tabnanny','_imp','colorsys','multiprocessing','tarfile','_io','compileall','netrc','telnetlib','_json','concurrent','nis','tempfile','_locale','configparser','nntplib','termios','_lsprof','contextlib','ntpath','test','_lzma','contextvars','nturl2path','test_bootstrap','_markupbase','copy','numbers','test_lambda_runtime_client','_md5','copyreg','opcode','test_lambda_runtime_marshaller','_multibytecodec','crypt','operator','textwrap','_multiprocessing','csv','optparse','this','_opcode','ctypes','os','threading','_operator','curses','ossaudiodev','time','_osx_support','dataclasses','parser','timeit','_pickle','datetime','pathlib','tkinter','_posixsubprocess','dateutil','pdb','token','_py_abc','dbm','pickle','tokenize','_pydecimal','decimal','pickletools','trace','_pyio','difflib','pip','traceback','_queue','dis','pipes','tracemalloc','_random','distutils','pkg_resources','tty','_sha1','doctest','pkgutil','turtle','_sha256','docutils','platform','turtledemo','_sha3','dummy_threading','plistlib','types','_sha512','easy_install','poplib','typing','_signal','email','posix','unicodedata','_sitebuiltins','encodings','posixpath','unittest','_socket','ensurepip','pprint','urllib','_sqlite3','enum','profile','urllib3','_sre','errno','pstats','uu','_ssl','faulthandler','pty','uuid','_stat','fcntl','pwd','venv','_string','filecmp','py_compile','warnings','_strptime','fileinput','pyclbr','wave','_struct','fnmatch','pydoc','weakref','_symtable','formatter','pydoc_data','webbrowser','_sysconfigdata_m_linux_x86_64-linux-gnu','fractions','pyexpat','wsgiref','_testbuffer','ftplib','queue','xdrlib','_testcapi','functools','quopri','xml','_testimportmultiple','gc','random','xmlrpc','_testmultiphase','genericpath','re','xxlimited','_thread','getopt','readline','xxsubtype','_threading_local','getpass','reprlib','zipapp','_tracemalloc','gettext','resource','zipfile','_warnings','glob','rlcompleter','zipimport','_weakref','grp','runpy','zlib','_weakrefset','gzip','s3transfer'])
west_packages = set(['stringprep','_datetime','cgitb','macpath','struct','_dbm','chunk','mailbox','subprocess','_decimal','cmath','mailcap','sunau','_dummy_thread','cmd','marshal','symbol','_elementtree','code','math','symtable','_functools','codecs','mimetypes','sys','_gdbm','codeop','mmap','sysconfig','_hashlib','collections','modulefinder','syslog','_heapq','colorsys','multiprocessing','tabnanny','_imp','compileall','netrc','tarfile','_io','concurrent','nis','telnetlib','_json','configparser','nntplib','tempfile','_locale','contextlib','ntpath','termios','_lsprof','contextvars','nturl2path','test','_lzma','copy','numbers','test_bootstrap','_markupbase','copyreg','opcode','test_lambda_runtime_client','_md5','crypt','operator','test_lambda_runtime_marshaller','_multibytecodec','csv','optparse','textwrap','_multiprocessing','ctypes','os','this','_opcode','curses','ossaudiodev','threading','_operator','dataclasses','parser','time','_osx_support','datetime','pathlib','timeit','_pickle','dateutil','pdb','tkinter','_posixsubprocess','dbm','pickle','token','_py_abc','decimal','pickletools','tokenize','_pydecimal','difflib','pip','trace','_pyio','dis','pipes','traceback','_queue','distutils','pkg_resources','tracemalloc','_random','doctest','pkgutil','tty','_sha1','docutils','platform','turtle','_sha256','dummy_threading','plistlib','turtledemo','_sha3','easy_install','poplib','types','_sha512','email','posix','typing','_signal','encodings','posixpath','unicodedata','_sitebuiltins','ensurepip','pprint','unittest','_socket','enum','profile','urllib','_sqlite3','errno','pstats','urllib3','_sre','faulthandler','pty','uu','_ssl','fcntl','pwd','uuid','_stat','filecmp','py_compile','venv','_string','fileinput','pyclbr','warnings','_strptime','fnmatch','pydoc','wave','_struct','formatter','pydoc_data','weakref','_symtable','fractions','pyexpat','webbrowser','_sysconfigdata_m_linux_x86_64-linux-gnu','ftplib','queue','wsgiref','_testbuffer','functools','quopri','xdrlib','_testcapi','gc','random','xml','_testimportmultiple','genericpath','re','xmlrpc','_testmultiphase','getopt','readline','xxlimited','_thread','getpass','reprlib','xxsubtype','_threading_local','gettext','resource','zipapp','_tracemalloc','glob','rlcompleter','zipfile','_warnings','grp','runpy','zipimport','_weakref','gzip','s3transfer','zlib'])

print(f'East packages count: {len(east_packages)}')
print(f'West packages count: {len(west_packages)}')
print(f'Only in east: {east_packages.difference(west_packages)}')
print(f'Only in west: {west_packages.difference(east_packages)}')

East packages count: 216
West packages count: 217
Only in east: {'_weakrefset', 'chardet'}
Only in west: {'_datetime', 'stringprep', 'cgitb'}

mhart commented 4 years ago

@bclodius you've pulled the latest build image? It should have this sha: https://hub.docker.com/layers/lambci/lambda/build-python3.7/images/sha256-e9046d2e685c9684cd5c1cc5eed078351042176c2a0fc7f73fe5bf487876ac3e

bclodius commented 4 years ago

@mhart my comment was more about the fact that AWS still has the issue in some of their accounts, so I captured the environment directly from the AWS console. I can help test your image in the short term, though.
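For anyone wanting to repeat that capture, the module list can also be gathered programmatically rather than copied out of help("modules") output. This is only a sketch, assuming a throwaway function whose sole job is to report its environment; `visible_modules` is a hypothetical name, and the result may not match help("modules") entry for entry.

```python
import pkgutil
import sys


def visible_modules():
    """Names of top-level modules importable in this environment."""
    names = {module.name for module in pkgutil.iter_modules()}
    # Built-in modules such as sys have no file on sys.path, so add them explicitly.
    names.update(sys.builtin_module_names)
    return names


def lambda_handler(event, context):
    # A sorted list is JSON-serializable and easy to compare between regions.
    return sorted(visible_modules())
```

The lists returned in two regions can then be turned into sets and compared with `difference`, as in the comparison earlier in the thread.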

mhart commented 4 years ago

Ah, gotcha. Gonna close this as fixed by https://github.com/lambci/docker-lambda/commit/e69aa8b1f5ed85938e2be61db0ed9f574e17dd58

bclodius commented 4 years ago

@mhart Just circling back; I tested the latest image and it did isolate me from the errors introduced on the AWS side.

Thanks for the very quick follow-up, discussion, and solution.

A known side effect of this is that the code zip is a little bit larger, but that is a small price to pay to isolate ourselves from environmental inconsistencies.

Cheers 🥂