WikiWatershed / mmw-tiler

Tiler for Model My Watershed
Apache License 2.0

AWS 2-6: Generate, Store Static Tiles for Lower Zoom Levels #9

Open rajadain opened 3 weeks ago

rajadain commented 3 weeks ago

TiTiler is optimized for tiling images at higher zoom levels, where only a small portion of a source tile has to be read, and in most cases only a single source tile. At lower zoom levels, a much larger number of source tiles have to be read, resulting in long wait times that will exceed the built-in timeouts (30s for API Gateway, and 15m for Lambda).

To get around this, we should create static tiles for the lower zoom levels and upload them to S3. Then, we can have a proxy which can serve the lower zoom levels from the static tile bucket, and the higher levels via normal TiTiler endpoints.
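Roughly, the split could look something like this (just a sketch, not an implementation: the cutoff zoom, route shape, and use of redirects are placeholders, and the bucket and tiler hostnames are only the staging ones mentioned elsewhere in this issue):

    from fastapi import FastAPI
    from fastapi.responses import RedirectResponse

    app = FastAPI()

    STATIC_MAX_ZOOM = 9  # assumed cutoff; tiles at z <= this come from S3
    STATIC_BUCKET_URL = 'https://tile-cache.staging.modelmywatershed.org'
    TITILER_URL = 'https://tiler.staging.modelmywatershed.org'

    @app.get('/tiles/{mosaic_id}/{z}/{x}/{y}.png')
    def tile(mosaic_id: str, z: int, x: int, y: int):
        if z <= STATIC_MAX_ZOOM:
            # Pre-generated tile previously uploaded to the static tile bucket
            return RedirectResponse(f'{STATIC_BUCKET_URL}/{mosaic_id}/{z}/{x}/{y}.png')
        # Otherwise fall through to the normal dynamic TiTiler endpoint
        return RedirectResponse(
            f'{TITILER_URL}/mosaicjson/mosaics/{mosaic_id}/tiles/{z}/{x}/{y}.png')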

For this card:

rajadain commented 1 week ago

This notebook shows how to generate static tiles locally: https://github.com/WikiWatershed/mmw-tiler/blob/main/mosaic.ipynb

rajadain commented 1 week ago

Here's the RGBA colormap for this dataset:

{ 0: {'color': '#00000000', 'label': 'No Data'},
  1: {'color': '#419BDFFF', 'label': 'Water'},
  2: {'color': '#397D49FF', 'label': 'Trees'},
  4: {'color': '#7A87C6FF', 'label': 'Flooded vegetation'},
  5: {'color': '#E49635FF', 'label': 'Crops'},
  7: {'color': '#C4281BFF', 'label': 'Built area'},
  8: {'color': '#A59B8FFF', 'label': 'Bare ground'},
  9: {'color': '#A8EBFFFF', 'label': 'Snow/ice'},
 10: {'color': '#616161FF', 'label': 'Clouds'},
 11: {'color': '#E3E2C3FF', 'label': 'Rangeland'} }
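For reference, one way (not from the notebook) to turn that dict into the colormap query parameter used by the tile URLs further down; it drops the alpha byte only because the existing staging URL uses plain RRGGBB values:

    import json
    from urllib.parse import quote

    colormap = {
        0: '#00000000', 1: '#419BDFFF', 2: '#397D49FF', 4: '#7A87C6FF',
        5: '#E49635FF', 7: '#C4281BFF', 8: '#A59B8FFF', 9: '#A8EBFFFF',
        10: '#616161FF', 11: '#E3E2C3FF',
    }

    # JSON-encode {value: '#rrggbb'} and URL-encode it for the query string
    param = quote(json.dumps({str(k): v[:7].lower() for k, v in colormap.items()}))
    print(f'...&colormap={param}')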
rajadain commented 1 week ago

I updated the comment above to indicate that we should put the tiles in tile-cache.staging.modelmywatershed.org in us-east-1

KlaasH commented 2 days ago

Since I'll be away for a week (returning 7/5), here's a writeup of what I've been trying to do on this and where it has taken me so far.

First I ran through the Jupyter notebook, generating a new mosaic just for 2019. It worked more or less as written, producing files in tiles/ but also generating quite a few errors from the tile-saving step. The tiles in the directory have no extension and there's only one in each directory. They're 256x256, i.e. 1x; I'm not sure whether the 2x tiles got generated and overwritten or never got generated.

Then I switched gears to trying to get titiler-mosaicjson running locally, toward the goal of figuring out where the branch point for static vs. dynamic should go. To that end, I first spun up a local MMW instance. That went fine except I had to change celery_version to 5.2.7 because pip had a problem with the 5.2.0 package.

I added a layer config to layer_settings.py for the staging layer I had generated:

    {
        'display': 'Test Mosaic Land Cover',
        'code': 'test_mosaic-2019_2019',
        'css_class_prefix': 'test-2019-10m nlcd',
        'short_display': 'MosaicTest 2019',
        'helptext': 'Land cover maybe.',
        'url': 'https://tiler.staging.modelmywatershed.org/mosaicjson/mosaics/cd9149fb-831b-448d-980c-c3c2aa6291c1/tiles/{z}/{x}/{y}@2x.png?colormap=%7B%220%22%3A%20%22%23000000%22%2C%20%221%22%3A%20%22%23419bdf%22%2C%20%222%22%3A%20%22%23397d49%22%2C%20%223%22%3A%20%22%23000000%22%2C%20%224%22%3A%20%22%237a87c6%22%2C%20%225%22%3A%20%22%23e49635%22%2C%20%226%22%3A%20%22%23000000%22%2C%20%227%22%3A%20%22%23c4281b%22%2C%20%228%22%3A%20%22%23a59b8f%22%2C%20%229%22%3A%20%22%23a8ebff%22%2C%20%2210%22%3A%20%22%23616161%22%2C%20%2211%22%3A%20%22%23e3e2c3%22%7D',
        'maxNativeZoom': 13,
        'maxZoom': 18,
        'opacity': 0.618,
        'has_opacity_slider': True,
        'legend_mapping': { key: names[1] for key, names in NLCD.items()},
        'big_cz': True,
    },

That worked, i.e. I could get tiles by activating that layer.

To try to replace that staging endpoint with a local one I could modify and debug with, I checked out titiler-mosaicjson and started it up locally.

Running titiler-mosaicjson

The README describes global installation of the dependencies. There is a Docker build in the Filmdrop Terraform config, but just running it on the host in a venv seemed like the simplest way to go.

python3.10 -m venv .venv
source .venv/bin/activate
python -m pip install boto3
python -m pip install -e src/titiler/core -e src/titiler/extensions -e src/titiler/mosaic -e src/titiler/application

The addition of boto3 was to use DynamoDB, which I ended up not doing, but it'll be useful for talking to S3.
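For example, uploading the notebook's tiles/ output to the bucket from the comment above would look roughly like this (the flat key layout and PNG content type are just assumptions, not something decided yet):

    from pathlib import Path
    import boto3

    s3 = boto3.client('s3')
    TILE_CACHE_BUCKET = 'tile-cache.staging.modelmywatershed.org'

    # Mirror the local tiles/ directory into the bucket, preserving relative paths
    for path in Path('tiles').rglob('*'):
        if path.is_file():
            key = path.relative_to('tiles').as_posix()
            s3.upload_file(str(path), TILE_CACHE_BUCKET, key,
                           ExtraArgs={'ContentType': 'image/png'})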

DynamoDB local

Running a local DynamoDB for dev/debugging is simple:

docker run -p 8333:8000 amazon/dynamodb-local

So is making a table:

aws dynamodb create-table --table-name mosaicjson --attribute-definitions AttributeName=mosaicId,AttributeType=S AttributeName=quadkey,AttributeType=S --key-schema AttributeName=mosaicId,KeyType=HASH AttributeName=quadkey,KeyType=RANGE --billing-mode PAY_PER_REQUEST --endpoint-url http://localhost:8333
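And a quick sanity check that the table is reachable (with boto3 and dummy credentials pointed at the local endpoint from the docker run above):

    import boto3

    dynamodb = boto3.client(
        'dynamodb',
        endpoint_url='http://localhost:8333',
        region_name='us-east-1',
        aws_access_key_id='local',
        aws_secret_access_key='local',
    )
    # Should print 'ACTIVE' once the create-table call above has run
    print(dynamodb.describe_table(TableName='mosaicjson')['Table']['TableStatus'])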

Then I ran the titiler-mosaicjson server with

export AWS_REQUEST_PAYER=requester
export MOSAIC_BACKEND=dynamodb://
export MOSAIC_HOST=/localhost:8333/mosaicjson
uvicorn --port 8444 titiler.application.main:app --reload

but I didn't have any luck getting the server to connect to the DynamoDB container. It kept saying "Invalid DynamoDB path". No doubt it can be done, but I realized titiler-mosaicjson can also run with a file backend, so I fell back to using that (by just running it without those environment variables; it defaults to the file backend).

Mosaic creation

After starting the titiler service:

uvicorn --port 8444 titiler.application.main:app --reload

I sent it a request, adapted from the README, to create a mosaic:

curl -X "POST" "http://127.0.0.1:8444/mosaicjson/mosaics" \
     -H 'Content-Type: application/vnd.titiler.stac-api-query+json' \
     -d $'{
  "stac_api_root": "https://api.impactobservatory.com/stac-aws",
  "max_items": 100,
  "bbox": [-86.28392871498, 42.712975131967426, -85.5220226, 43.1163245],
  "datetime": "2020-12-31T00:00:00Z/2022-12-31T23:59:59.999Z",
  "collections": [
    "io-10m-annual-lulc"
  ],
  "asset_name": "supercell"
}'

Swapping the resulting tile URL template in for the url in the test layer config sort of worked. Or rather, it mostly didn't, but I think the problem was just that the local server didn't handle the load well and most of the requests timed out. Loading individual tiles generally worked.

Next steps

The goal of the above was to get a view into how the TiTiler service handles tile requests and to figure out where to attach a "read generated tiles from S3" step. Since the service is what encodes the request parameters, pulling cached files rather than generating new ones will either have to happen inside the service, after it has made that translation, or we'll need to borrow or reproduce the encoding logic to identify the right S3 path for a given tile. I didn't quite get to where I was trying to go yet, but hopefully I'm on the right track and not far off.
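For the in-service option, the lookup could be as simple as this sketch, assuming a {mosaic_id}/{z}/{x}/{y}.png key scheme in the tile-cache bucket (the bucket constant and helper name are placeholders, not anything the service has today):

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client('s3')
    TILE_CACHE_BUCKET = 'tile-cache.staging.modelmywatershed.org'

    def cached_tile(mosaic_id, z, x, y):
        """Return pre-generated tile bytes from S3, or None to fall through
        to dynamic rendering."""
        key = f'{mosaic_id}/{z}/{x}/{y}.png'
        try:
            return s3.get_object(Bucket=TILE_CACHE_BUCKET, Key=key)['Body'].read()
        except ClientError as e:
            if e.response['Error']['Code'] == 'NoSuchKey':
                return None
            raise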

The Jupyter notebook makes sense as a strategy for generating the tiles. I think the errors might be caused by tile_url getting shared between loops but modified inside the loop for the last step, so that should be easy to resolve. Depending on how we want to attach the cache, we might also want to change the filenames in that loop.
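To illustrate the suspected pattern with hypothetical code (not the notebook's actual code): if tile_url is built once and then mutated for the @2x step, the next iteration starts from the already-modified value, whereas formatting a fresh URL each pass and deriving the @2x name keeps the two outputs distinct:

    tile_url_template = 'https://tiler.example.org/tiles/{z}/{x}/{y}'

    for z, x, y in [(2, 1, 1), (2, 1, 2)]:                      # sample tile coordinates
        tile_url = tile_url_template.format(z=z, x=x, y=y) + '.png'   # fresh each pass
        tile_url_2x = tile_url.replace('.png', '@2x.png')             # derived, not mutated
        print(tile_url, tile_url_2x)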