OpenDRR / pygeoapi

pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
https://pygeoapi.io

Docker image build fails when the flag '--no-binary pydantic' is added to requirements.txt #15

Open arashmalekz opened 2 years ago

arashmalekz commented 2 years ago

I added the flag '--no-binary pydantic' to requirements.txt on a new line so that pip does not install the pydantic binaries, which makes the Lambda deployment package smaller. However, with this flag in place, building the pygeoapi Docker image fails with the following error:

Obtaining file:///pygeoapi
    ERROR: Command errored out with exit status 1:
     command: /usr/bin/python3 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/pygeoapi/setup.py'"'"'; __file__='"'"'/pygeoapi/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info
         cwd: /pygeoapi/
    Complete output (1 lines):
    error in pygeoapi setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Invalid requirement, parse error at "'--no-bin'"
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

anthonyfok commented 2 years ago

Hey @arashmalekz, thank you for your very helpful report! I was able to reproduce the error that you were seeing, and it turns out that pip3 install -e ., which runs setup.py, is the "culprit". So, after removing --no-binary pydantic from requirements.txt, I borrowed a trick already used inside the Dockerfile (where upstream pre-installs elasticsearch-dsl with pip3 for some reason):

--- a/Dockerfile
+++ b/Dockerfile
@@ -92,6 +92,8 @@ RUN \
     && if [ "$BUILD_DEV_IMAGE" = "true" ] ; then pip3 install -r requirements-dev.txt; fi \
     # Temporary fix for elasticsearch-dsl module not available as deb package in bionic
     && pip3 install elasticsearch-dsl \
+    # Reduce size of Docker image by not installing pydantic binary package
+    && pip3 install --no-binary pydantic pydantic \
     && pip3 install -e . \
     # OGC schemas local setup
     && mkdir /schemas.opengis.net \

pip3 install -e . then sees that pydantic is already installed (in whatever form) and leaves it alone, thus skipping the binary wheel download. See also PR #17 if we want this in the master branch too.
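
The parse error reported above is consistent with setup.py passing every line of requirements.txt to install_requires, which accepts only requirement specifiers and not pip options such as --no-binary. As a minimal sketch of an alternative, assuming that is how setup.py reads the file, the option lines could be filtered out before they reach setuptools. The helper name split_requirements and the sample lines are hypothetical and not taken from the repository:

```python
def split_requirements(lines):
    """Separate pip option lines from requirement specifiers.

    Hypothetical helper: only the specifiers are valid in install_requires;
    lines starting with "-" (e.g. "--no-binary pydantic") are pip options.
    """
    requirements, options = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("-"):
            options.append(line)
        else:
            requirements.append(line)
    return requirements, options


# Illustrative input, not the real contents of requirements.txt
sample = ["Flask", "pydantic", "--no-binary pydantic", "", "# a comment"]
install_requires, pip_options = split_requirements(sample)
print(install_requires)
print(pip_options)
```

With this kind of filtering, setup.py would only ever see the specifiers, while pip itself would still honour the option line when run with -r requirements.txt.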

Hope this will make the Docker image small enough (969MB → 861MB before compression) for AWS Lambda. Please test and let me know! If it is still too large, I think there could be some more tricks to reduce the size of the Docker image further.

arashmalekz commented 2 years ago

Hey @anthonyfok, sorry, I finally had a chance to look at this! I'm using the Docker image to generate the pygeoapi and OpenAPI config. The package I build for Lambda is not actually a Docker image. I'm using the Serverless Framework (the aws-lambda folder in this repo). It reads the requirements.txt file, installs the packages locally, and then creates a Lambda zip file. I think what I can do is create a requirements.txt file specific to serverless builds and get it to use that. This way we could leave the Dockerfile as is. Let me see if I can do it that way and I'll let you know.
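
One way such a serverless-specific file could look is a thin wrapper that includes the shared requirements.txt and adds the --no-binary option on top. The following is only a sketch: the function name and the output file name requirements-serverless.txt are hypothetical, it assumes pydantic is already listed in the shared file, and how the serverless build is pointed at the new file is not covered here.

```python
def serverless_requirements(base_file="requirements.txt", no_binary=("pydantic",)):
    """Return the text of a requirements file that wraps base_file.

    "-r" includes the shared requirements; each "--no-binary" line asks pip
    to build the named package from source instead of using a wheel.
    """
    lines = [f"-r {base_file}"]
    lines += [f"--no-binary {name}" for name in no_binary]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical output file name, chosen for illustration only
    with open("requirements-serverless.txt", "w", encoding="utf-8") as fh:
        fh.write(serverless_requirements())
```

Because setup.py would keep reading the unchanged requirements.txt, the Docker build would not be affected by the extra option line.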

anthonyfok commented 2 years ago

Hey @arashmalekz, thank you so much for looking into this! (And sorry for my late reply.)

Joost reminded me that we no longer deploy pygeoapi on AWS Lambda. Instead, a full pygeoapi container is now used to allow larger records (e.g. more than 10 items) to be sent to clients (something like that, if I recall correctly?).

If so, I was wondering if we could close this issue? Please let me know! Many thanks!