donnyv opened 5 years ago:
Not a fan of docker. How would I build and deploy without it?
Fair enough. Use the steps in the Dockerfile as instructions to translate to your target environment ;-) Those are for Ubuntu 16.04.
Short version:
pip install -r requirements-server.txt
gunicorn -k gevent virtual.web:app
You'll almost certainly need to install some system dependencies for pip to be able to install everything correctly.
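For what it's worth, here's a rough sketch of what that translation might look like on a bare Ubuntu box. The package names are my guesses at what rasterio / the GDAL bindings typically need, not copied from the Dockerfile, so compare against it before relying on them:

# assumed system packages for building rasterio / GDAL bindings (verify against the Dockerfile)
sudo apt-get update
sudo apt-get install -y build-essential python3-dev python3-pip libgdal-dev
# then the short version from above
pip install -r requirements-server.txt
gunicorn -k gevent virtual.web:app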
I'm assuming you haven't tried this with Windows?
Nope, sorry. I develop locally on Ubuntu using Gunicorn or Flask's built-in web server and usually deploy to AWS Lambda.
I'm not aware of any specific reasons that it wouldn't work (since rasterio is known to work on Windows), but it's been years since I've used Python on Windows.
How hard would it be to switch out Gunicorn with nginx?
Nginx is usually configured as a reverse proxy in front of something that speaks HTTP (like Gunicorn) or WSGI (like uWSGI), so I don't think you'd actually swap them; it's more that you'd layer Nginx in front and, if anything, change out the application server separately.
This looks helpful: https://www.nginx.com/blog/maximizing-python-performance-with-nginx-parti-web-serving-and-caching/
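To make the layering concrete, here's a minimal sketch of the shape it usually takes; the port, file names, and paths are placeholders I picked for illustration, not anything from this project:

# run the application server on a local port
gunicorn -k gevent -b 127.0.0.1:8000 virtual.web:app

# hypothetical nginx site config that proxies to it
cat <<'EOF' | sudo tee /etc/nginx/sites-available/marblecutter
server {
    listen 80;
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF
sudo ln -sf /etc/nginx/sites-available/marblecutter /etc/nginx/sites-enabled/marblecutter
sudo nginx -s reload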
Yeah I've done that before. My worry is with Gunicorn. These performance numbers are making me a little uncomfortable. Even with Nginx in front of it I would probably need to load balance it with multiple instances for large loads. How has the performance been for you on AWS Lambda? https://www.appdynamics.com/blog/engineering/a-performance-analysis-of-python-wsgi-servers-part-2/
Alas, if only those were the worst of the performance numbers. marblecutter (and rasterio / GDAL by extension) is by far the greater bottleneck, especially when rendering tiles out of remote COGs.
It's not uncommon for tile requests to take 1s or more with an empty cache.
CPU is a consideration (especially when data needs to be reprojected), but the biggest driver is network / storage latency. GDAL attempts to minimize the number of upstream requests (and to parallelize them where possible), but even so, there are usually 2-3 round-trips to find the IFD, read the IFD, and request overlapping regions of the source image. If latency is 100ms (not uncommon with S3), that adds up immediately, and most server processes will spend much of their time waiting for upstream data. Rendering from a local fileserver (rather than remote sources) or from an endpoint with lower latency should help more than any tuning of your application server.
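If you do end up tuning anything, GDAL's remote-read settings are usually a bigger lever than the application server. A hedged sketch of environment variables commonly set when reading remote COGs (check the GDAL docs for your version; the values here are placeholders, not recommendations):

export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR            # skip directory listings when opening /vsicurl or /vsis3 paths
export CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.tif,.tiff,.ovr  # don't probe unrelated URLs
export VSI_CACHE=TRUE                                    # cache HTTP range reads in memory
export VSI_CACHE_SIZE=25000000                           # ~25MB; placeholder value
gunicorn -k gevent virtual.web:app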
Lambda ends up working really well for this, in part because of its single-invocation-per-request model, but more because it can scale out almost instantaneously to absorb spikes. I typically run a CloudFront cache in front, so popular areas don't need to be constantly re-rendered (configuring Nginx as a cache would help immensely here, if certain regions are commonly requested).
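If you go the Nginx route, its built-in proxy cache can play roughly the same role CloudFront plays here. A minimal sketch; the zone name, sizes, and TTLs are placeholders, not recommendations:

# hypothetical cache zone; proxy_cache_path has to live at the http{} level,
# which is where files under /etc/nginx/conf.d/ are included
cat <<'EOF' | sudo tee /etc/nginx/conf.d/tile-cache.conf
proxy_cache_path /var/cache/nginx/tiles levels=1:2 keys_zone=tiles:10m max_size=10g inactive=7d use_temp_path=off;
EOF

# then, inside the location block that proxies to Gunicorn, something like:
#   proxy_cache tiles;
#   proxy_cache_valid 200 7d;
#   proxy_cache_use_stale error timeout updating;
sudo nginx -s reload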
If your data is on a publicly-accessible HTTP endpoint, give tiles.rdnt.io a shot to see how it performs. It's deployed on Lambda (with 1536MB allocated, mainly for the corresponding CPU increase) and will attempt to minimize latency by rendering from the AWS region closest to your data (only determinable for S3-hosted data right now).
If pre-rendering to tiles is feasible (small target area, uniform region popularity, requirement for low latency), that's ideal. Otherwise, the trade-off with marblecutter-virtual (and friends) is that initial tile requests will be slow in exchange for having access to extremely large regions immediately.
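If you do decide to pre-render, that part isn't specific to marblecutter; for example, gdal2tiles can write a static tile pyramid from a source image. A sketch, with placeholder zoom range and paths:

# pre-render a fixed zoom range to static tiles (adjust zooms to your area of interest)
gdal2tiles.py --zoom=0-12 --processes=4 input.tif tiles/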
Thanks for the great write up! You gave me a lot to think about.