Miserlou / Zappa

Serverless Python
https://blog.zappa.io/
MIT License

slim_handler: true fails in Windows, possibly due to file slashes in tarball #2145

Open CaseGuide opened 4 years ago

CaseGuide commented 4 years ago

It is not possible to use Zappa on Windows with slim_handler = true. This is likely due to the slashes used when tarring the file. This issue is well known, probably easy to fix, and has cropped up in #1870 and #1358 with a PR for #1358 in #1570 that seemed to not get integrated.

Context

Using venv and Python 3.6 on Windows 10, I was able to successfully deploy a project and get the expected results. After I installed pandas, my project exceeded the package size that Lambda would accept. After pip install pandas and the failed deployment, I changed zappa_settings.json to add "slim_handler": true, then deployed again. I'm now encountering an issue that returns the error "errorMessage": "Unable to import module 'index'". My handler function is a regular Python function in index.py and was otherwise working fine.

Expected Behavior

Expect this deployment to be successful and return a 200 and some JSON via my API Gateway

Actual Behavior

I receive a 502 {"message": "Internal server error"} from the API Gateway endpoint, and zappa tail returns many instances of: [1596019020990] Unable to import module 'index': No module named 'index'. My function is in index.py, and my settings contain "lambda_handler": "index.handler".

Possible Fix

Steps to Reproduce

  1. Create venv, pip install zappa, etc. on a Windows 10 machine
  2. zappa deploy with "slim_handler": true, "delete_local_zip": false,
  3. Run zappa tail. The output should be something like: Unable to import module <handler_function>: No module named <your import>
  4. Using the .tar.gz created on Windows, list the files in it using tar --list --verbose --file=<project>-<env>-<time>.tar.gz on Linux. The output, in my case from WSL Ubuntu, shows that there are indeed mixed slashes:
    -rw-rw-rw- 0/0               0 2020-07-28 16:54 djangocms_icon/migrations/__pycache__/__init__.py
    -rwxr-xr-x 0/0            4610 2020-07-28 16:54 djangocms_icon\\static\\djangocms_icon\\css\\djangocms-icon.css
    -rwxr-xr-x 0/0             274 2020-07-28 16:54 djangocms_icon\\static\\djangocms_icon\\js\\base.js
    [...]
    -rwxr-xr-x 0/0            2670 2020-07-28 16:54 ~umpy\\tests\\__pycache__\\test_warnings.cpython-36.pyc
    -rwxr-xr-x 0/0             168 2020-07-28 16:54 ~umpy\\tests\\__pycache__\\__init__.cpython-36.pyc
    -rw-rw-rw- 0/0               0 2020-07-28 16:54 ~umpy/tests/__pycache__/__init__.py
    [...]
    -rwxr-xr-x 0/0            8140 2020-07-28 16:54 ~umpy\\__pycache__\\__init__.cpython-36.pyc
    -rw-rw-rw- 0/0               0 2020-07-28 16:54 ~umpy/__pycache__/__init__.py
    [...and so on]
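
The same check as step 4 can be done without a Linux shell. This is a minimal sketch using only the standard library `tarfile` module; the function name `find_backslash_members` is made up for illustration and is not part of Zappa.

```python
import tarfile


def find_backslash_members(tar_path):
    """Return the names of archive members that contain a backslash.

    On Linux (and therefore on Lambda) a backslash is an ordinary
    filename character, so such members are not unpacked into
    subdirectories.
    """
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz).
    with tarfile.open(tar_path, mode="r:*") as archive:
        return [m.name for m in archive.getmembers() if "\\" in m.name]


if __name__ == "__main__":
    import sys

    # Usage: python check_tar.py <project>-<env>-<time>.tar.gz
    for name in find_backslash_members(sys.argv[1]):
        print(name)
```

An empty result means every member name uses forward slashes only.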

Your Environment

    Django==2.0.6
    mysqlclient==1.3.12
    pymysql

    # Created by djangocms-installer
    django-cms>=3.7,<3.8
    djangocms-admin-style>=1.5,<1.6
    django-treebeard>=4.0,<5.0
    djangocms-text-ckeditor>=3.7,<4.0
    djangocms-link>=2.5,<2.7
    djangocms-icon>=1.4,<1.6
    djangocms-style>=2.2,<2.4
    djangocms-googlemap>=1.3,<1.5
    djangocms-snippet>=2.2,<2.4
    djangocms-video>=2.1,<2.4
    djangocms-file>=2.3,<2.5
    djangocms-picture>=2.3,<2.5
    djangocms-bootstrap4>=1.5,<1.7
    easy_thumbnails
    django-filer>=1.3
    django-classy-tags>=0.9
    django-sekizai>=1.0
    django-mptt>0.9
    html5lib>=1.0.1
    Pillow>=3.0
    six
    pytz

    # Added later
    djangorestframework
    django-s3-storage==0.13.2

    # What I'm using for this part of the project.
    selenium==3.13.0
    pandas==0.25.1  # This version of pandas is needed for compatibility with zappa's python-dateutils requirement
    python-dateutil==2.6.1  # This overlaps the reqs of zappa 0.51.0 and pandas 0.25.1

* Your `zappa_settings.json`: 

    {
        "dev": {
            "aws_region": "us-east-1",
            "profile_name": "default",
            "project_name": "",
            "runtime": "python3.6",
            "s3_bucket": "",
            "use_precompiled_packages": true,
            "lambda_handler": "index.handler",
            "memory_size": 2048,
            "timeout_seconds": 300,
            "cloudwatch_log_level": "INFO",
            "slim_handler": true,
            "delete_local_zip": false, // Delete the local zip archive after code updates. Default true.
            "delete_s3_zip": false // Delete the s3 zip archive. Default true.
        }
    }


EDIT: I noted that this issue doesn't seem to affect the zip file. Running `unzip -l handler_<project>-<env>-<time>.zip` in WSL on the Windows-created handler zip works without issue. See output:
25185  1980-01-01 00:00   troposphere-2.6.2.dist-info/METADATA
18383  1980-01-01 00:00   troposphere-2.6.2.dist-info/RECORD
   12  1980-01-01 00:00   troposphere-2.6.2.dist-info/top_level.txt
   97  1980-01-01 00:00   troposphere-2.6.2.dist-info/WHEEL
14461  1980-01-01 00:00   urllib3/connection.py
35725  1980-01-01 00:00   urllib3/connectionpool.py
 7172  1980-01-01 00:00   urllib3/exceptions.py
 8553  1980-01-01 00:00   urllib3/fields.py

CaseGuide commented 4 years ago

I note that the reasoning for using a tarball is discussed in https://github.com/Miserlou/Zappa/pull/1022 and https://github.com/Miserlou/Zappa/pull/1037. Since the file slash issue is fundamental to Python's tarfile module, and there are good reasons for using a tarfile (streamed unzipping to /tmp seems to be the primary one), I'll go down the "fix the slashes in the tarball" path.

I confirmed that the line tarinfo = tarfile.TarInfo(os.path.join(root.replace(temp_project_path, '').lstrip(os.sep), filename)) in core.py is a source of this. It looks like replacing the separators in the string and path properties of the TarInfo object might fix it.
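
To illustrate the idea, here is a minimal sketch of building the archive name with forward slashes before it ever reaches `tarfile.TarInfo`. The helper names `posix_arcname` and `add_bytes` are hypothetical and are not Zappa's code. The sketch assumes every backslash in the path is a Windows separator, which would be wrong for a POSIX filename that legitimately contains a backslash.

```python
import io
import tarfile


def posix_arcname(root, base, filename):
    """Return the archive name of root/filename relative to base,
    always joined with forward slashes regardless of the host OS."""
    rel = root[len(base):] if root.startswith(base) else root
    # Treat both separator styles as separators and drop empty parts,
    # which also strips any leading separator.
    parts = [p for p in rel.replace("\\", "/").split("/") if p]
    return "/".join(parts + [filename])


def add_bytes(archive, arcname, data):
    """Add in-memory bytes to an open TarFile under the given name."""
    info = tarfile.TarInfo(arcname)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


# A Windows-style walk root yields a forward-slash member name.
print(posix_arcname("C:\\tmp\\proj\\pandas\\_libs", "C:\\tmp\\proj", "x.py"))
```

Because the name is assembled with `"/".join` rather than `os.path.join`, the result does not depend on `os.sep` of the machine that builds the archive.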

Livestream of edits as I try stuff...

  1. I downloaded the current (0.51.0) Zappa code zip
  2. pip uninstall zappa
  3. nav to zappa code, pip install -e .
  4. Got invalid characters in Readme, erased readme, pip install -e . success.

Replace the slashes in the tarball

Tried replacing the slashes in zappa\cli.py:

                    with open(os.path.join(root, filename), 'rb') as f:
                        if os.sep == '\\':
                            tarinfo.path = tarinfo.path.replace(os.sep, '/')
                        archivef.addfile(tarinfo, f)

Listing the tar contents in WSL now shows the correct slashes:

-rwxr-xr-x 0/0          342624 2020-07-29 07:38 pandas/_libs/tslibs/tzconversion.cpython-36m-x86_64-linux-gnu.so
-rwxr-xr-x 0/0             388 2020-07-29 07:38 pandas/_libs/tslibs/__init__.py
-rwxr-xr-x 0/0              69 2020-07-29 07:38 pandas-0.25.1.dist-info/entry_points.txt
-rwxr-xr-x 0/0               4 2020-07-29 07:37 pandas-0.25.1.dist-info/INSTALLER
-rwxr-xr-x 0/0            1582 2020-07-29 07:37 pandas-0.25.1.dist-info/LICENSE

However I am getting the same error.

Turn it off and then on again

Trying zappa undeploy production --remove-logs then zappa deploy because turning it on and off again definitely applies to software development.

Still failed with [1596025336429] Unable to import module 'index': No module named 'index' in tail.

Make it a zip file instead of a tar, performance decrease be damned.

Edited core.py to use a zip and changed the file extensions in the core.py filenames from .tar.gz to .zip. Edited handler.py to use zipfile if .zip is in the name.
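
For anyone trying the same experiment, the extension-based switch could look roughly like this. It is a sketch under assumptions, not the actual handler.py change: `extract_archive` is a made-up name, and it only handles a local file that has already been downloaded from a trusted source.

```python
import tarfile
import zipfile


def extract_archive(archive_path, dest_dir):
    """Unpack archive_path into dest_dir, choosing the module by extension."""
    if archive_path.endswith(".zip"):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
    else:
        # "r:*" auto-detects gzip, so this covers .tar.gz as well as .tar.
        with tarfile.open(archive_path, mode="r:*") as tf:
            tf.extractall(dest_dir)
```

Note that a zip has to be fully present on disk (or in memory) before it can be read, which is part of why the tarball was chosen in the first place.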

It looks like many errors were being suppressed when I had "lambda_handler": "index.handler" set. Removing that revealed some issues with unzipping the file.

Hooray a new error, now I'm running out of disk space! OSError: [Errno 28] No space left on device
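
One way to see this coming is to compare the unpacked size of the archive with the free space in the target directory before extracting. This is a rough sketch with hypothetical helper names (`uncompressed_size`, `fits_in`). It ignores filesystem block overhead, and it only helps for a tarball, since the check reads the member headers.

```python
import shutil
import tarfile


def uncompressed_size(tar_path):
    """Sum the sizes of all regular files in the archive, in bytes."""
    with tarfile.open(tar_path, mode="r:*") as tf:
        return sum(m.size for m in tf.getmembers() if m.isfile())


def fits_in(tar_path, target_dir="/tmp"):
    """Return True if the unpacked files should fit in target_dir."""
    return uncompressed_size(tar_path) <= shutil.disk_usage(target_dir).free
```

Running `uncompressed_size` locally on the generated archive gives a number to compare against the temporary storage available to the function.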

Forked my repo and made a lighter weight venv

The repo is now right on the edge of size for needing slim_handler. I successfully deployed the code with "slim_handler": false, but am confident that the next few bits of work I will have to do will cross the threshold.

Still unable to deploy with slim_handler. When I deploy with slim_handler true and lambda_handler commented out I see an error that appears to be related to this issue https://github.com/Miserlou/Zappa/issues/1834#issuecomment-632994410

Adding "include": [] didn't fix the issue. I attempted the fix described in #1834, adding the following to /zappa/__init__.py:

import pymysql
pymysql.install_as_MySQLdb()
pymysql.version_info = (1, 3, 13, 'final', 0)

...even though this project doesn't use a DB. I was greeted with a new error: Unable to import module 'handler': attempted relative import with no known parent package

CaseGuide commented 4 years ago

I'm giving up for today. It looks like slim_handler is broken, but it's not clear to me why, and it may not be for the reason I originally posted about.

CaseGuide commented 4 years ago

I was able to make it work on Windows Subsystem for Linux (Ubuntu) by creating a new venv, clean installing my requirements, and setting the .tar.gz in S3 to public. Obviously this is insecure and won't continue for long, but it's an indicator to anyone who lands here to check S3 permissions. Still no luck deploying from Windows, sadly.

CloudWatch Logs when deployed from Windows:

2020-08-26T14:30:20.205-04:00
[DEBUG] 2020-08-26T18:30:20.204Z 42a0da36830bbf4 Response: {'ResponseMetadata': {'RequestId': '13E105411C6A12E6', 'HostId': 'Wu6QXOk/Roihaflit3n61xY=', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amz-id-2': 'Wu6QXOk/Roin88tC03n61xY=', 'x-amz-request-id': '13E112E6', 'date': 'Wed, 26 Aug 2020 18:30:21 GMT', 'last-modified': 'Wed, 26 Aug 2020 18:30:12 GMT', 'etag': '"8b7a3dced2f7283208-10"', 'accept-ranges': 'bytes', 'content-type': 'binary/octet-stream', 'content-length': '75848129', 'server': 'AmazonS3'}, 'RetryAttempts': 0}, 'AcceptRanges': 'bytes', 'LastModified': datetime.datetime(2020, 8, 26, 18, 30, 12, tzinfo=tzutc()), 'ContentLength': 75848129, 'ETag': '"8b7a34a2349620a83208-10"', 'ContentType': 'binary/octet-stream', 'Metadata': {}, 'Body': <botocore.response.StreamingBody object at 0x7fe9d665dd68>}

2020-08-26T14:30:31.943-04:00
Failed to find library: libmysqlclient.so.18...right filename?

2020-08-26T14:30:31.986-04:00
No module named 'django.core.wsgi': ModuleNotFoundError
Traceback (most recent call last):
  File "/var/task/handler.py", line 609, in lambda_handler
    return LambdaHandler.lambda_handler(event, context)
  File "/var/task/handler.py", line 240, in lambda_handler
    handler = cls()
  File "/var/task/handler.py", line 146, in __init__
    wsgi_app_function = get_django_wsgi(self.settings.DJANGO_SETTINGS)
  File "/var/task/zappa/ext/django_zappa.py", line 9, in get_django_wsgi
    from django.core.wsgi import get_wsgi_application
ModuleNotFoundError: No module named 'django.core.wsgi'

I'm completely sure that it's able to access the .tar.gz.

CaseGuide commented 4 years ago

I was unsuccessful from Windows 10. I switched over to WSL and a venv created inside WSL, changing nothing else, and deployed successfully from there.

LaundroMat commented 3 years ago

I'm going the WSL route too, but FWIW, I managed to upload an 89 MB package without setting slim_handler: true and it worked.