developmentseed / covid-wb-api

COVID-19 Risk Schema API for the World Bank
MIT License

COG requests need rescale param (PNG driver doesn't support data type Float64) #25

Closed geohacker closed 4 years ago

geohacker commented 4 years ago

Over in https://github.com/developmentseed/covid-wb-api/pull/22#issuecomment-655802770, @guidorice and I were able to confirm that wp2020_vulnerability_map.tif, wp_2020_1km.tif and wp_2020_1km_urban_pop.tif are causing a 500 from titiler.

For example, curl http://covid-publi-131wmiy217ice-1414663655.us-east-1.elb.amazonaws.com/cog/wp_2020_1km/tiles/13/5865/3796.png returns Internal Server Error. In the CloudWatch logs, I can see the following error:


2020-07-09T14:04:54.875+05:30 | DEBUG:rasterio._io:Path: UnparsedPath(path='/vsimem/a154a60a-1b3a-4484-8430-ba420692b1f9/a154a60a-1b3a-4484-8430-ba420692b1f9.'), mode: w+, driver: PNG
2020-07-09T14:04:54.875+05:30 | DEBUG:rasterio._base:Nodata success: 0, Nodata value: 0.000000
2020-07-09T14:04:54.875+05:30 | DEBUG:rasterio._base:Nodata success: 0, Nodata value: 0.000000
2020-07-09T14:04:54.876+05:30 | DEBUG:rasterio._io:Skipped delete for overwrite. Dataset does not exist: /vsimem/a154a60a-1b3a-4484-8430-ba420692b1f9/a154a60a-1b3a-4484-8430-ba420692b1f9.
2020-07-09T14:04:54.876+05:30 | DEBUG:rasterio._io:Option: ('ZLEVEL', b'6')
2020-07-09T14:04:54.876+05:30 | DEBUG:rasterio.env:Exiting env context: <rasterio.env.Env object at 0x7ff0f84b8040>
2020-07-09T14:04:54.876+05:30 | DEBUG:rasterio.env:Cleared existing <rasterio._env.GDALEnv object at 0x7ff0f84b88e0> options
2020-07-09T14:04:54.876+05:30 | DEBUG:rasterio._env:Stopped GDALEnv <rasterio._env.GDALEnv object at 0x7ff0f84b88e0>.
2020-07-09T14:04:54.877+05:30 | DEBUG:rasterio.env:Exiting outermost env
2020-07-09T14:04:54.877+05:30 | DEBUG:rasterio.env:Exited env context: <rasterio.env.Env object at 
...
....
....
2020-07-09T14:04:54.877+05:30 | return await dependant.call(**values)
2020-07-09T14:04:54.877+05:30 | File "/covidwb/app/routers/titiler_router.py", line 207, in cog_tile
2020-07-09T14:04:54.877+05:30 | content = render(
2020-07-09T14:04:54.877+05:30 | File "/usr/local/lib/python3.8/dist-packages/rio_tiler/utils.py", line 418, in render
2020-07-09T14:04:54.877+05:30 | dst.write(mask.astype(tile.dtype), indexes=count + 1)
2020-07-09T14:04:54.877+05:30 | File "rasterio/_base.pyx", line 332, in rasterio._base.DatasetBase.__exit__
2020-07-09T14:04:54.877+05:30 | File "rasterio/_base.pyx", line 322, in rasterio._base.DatasetBase.close
2020-07-09T14:04:54.877+05:30 | File "rasterio/_io.pyx", line 2077, in rasterio._io.BufferedDatasetWriterBase.stop
2020-07-09T14:04:54.877+05:30 | File "rasterio/_err.pyx", line 205, in rasterio._err.exc_wrap_pointer
2020-07-09T14:04:54.877+05:30 | rasterio._err.CPLE_NotSupportedError: PNG driver doesn't support data type Float64. Only eight bit (Byte) and sixteen bit (UInt16) bands supported.
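
The last log line states the constraint directly: the PNG driver only accepts 8-bit (Byte) and 16-bit (UInt16) bands. As a rough illustration, a pre-flight check along these lines could flag layers that need rescaling before a PNG tile is rendered. The function name is hypothetical and the dtype strings follow the NumPy/rasterio naming convention (where GDAL's Byte is uint8); this is a sketch, not code from the API.

```python
# Data types the GDAL PNG driver accepts, per the error message in the log.
# "uint8" is the NumPy/rasterio name for GDAL's "Byte".
PNG_SUPPORTED_DTYPES = {"uint8", "uint16"}


def needs_rescale_for_png(dtype: str) -> bool:
    """Return True when a band of this dtype cannot be written to PNG as-is."""
    return dtype.lower() not in PNG_SUPPORTED_DTYPES


# Hypothetical usage: check a few dtype names before requesting a .png tile.
for dtype in ("float64", "uint16", "uint8"):
    print(dtype, needs_rescale_for_png(dtype))
```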

cc @bitner @pieschker

geohacker commented 4 years ago

I'll try a few things to see what's going on.

geohacker commented 4 years ago

Ha, actually it does work, when I do something like http://covid-publi-131wmiy217ice-1414663655.us-east-1.elb.amazonaws.com/cog/wp_2020_1km_urban_pop/tiles/1/1/1.png?bidx=1&rescale=0,2

Looking closely at the requests while running titiler locally and supplying the rescale and bidx params, it does seem to work.
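
For reference, a small sketch of how a client could assemble such a tile URL. The /cog/{layer}/tiles/{z}/{x}/{y}.png path and the bidx and rescale parameters are taken from the request above; the helper name and the localhost base URL are assumptions for illustration only.

```python
from urllib.parse import urlencode


def tile_url(base, layer, z, x, y, bidx=1, rescale=(0, 2)):
    """Build a PNG tile URL with explicit band index and rescale range."""
    # Keep the comma unescaped so the query reads rescale=0,2 as in the example.
    query = urlencode(
        {"bidx": bidx, "rescale": f"{rescale[0]},{rescale[1]}"},
        safe=",",
    )
    return f"{base}/cog/{layer}/tiles/{z}/{x}/{y}.png?{query}"


# Hypothetical local titiler instance.
print(tile_url("http://localhost:8000", "wp_2020_1km_urban_pop", 1, 1, 1))
```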

geohacker commented 4 years ago

I'm guessing what the rescale is doing is scaling the input domain to the range we specify. https://gdal.org/programs/gdal_translate.html#cmdoption-gdal-translate-scale
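
To make the guess concrete: as far as I understand titiler and rio-tiler, the two numbers in rescale name the input range (min, max), which is stretched linearly onto the 8-bit output range 0 to 255, with values outside the input range clipped. The pure-Python sketch below illustrates that idea under this assumption; it is not rio-tiler's actual implementation.

```python
def linear_rescale(values, in_range, out_range=(0, 255)):
    """Linearly map values from in_range to out_range, clipping outliers."""
    in_min, in_max = in_range
    out_min, out_max = out_range
    scale = (out_max - out_min) / (in_max - in_min)
    result = []
    for v in values:
        # Clip to the input range first, then stretch to the output range.
        clipped = min(max(v, in_min), in_max)
        result.append(round((clipped - in_min) * scale + out_min))
    return result


# With rescale=0,2 every value at or above 2 saturates at 255.
print(linear_rescale([0.0, 0.5, 2.0, 143742.84375], in_range=(0, 2)))
```

Once the Float64 values are mapped into 0 to 255, the tile can be written as Byte data, which the PNG driver accepts.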

guidorice commented 4 years ago

In today's scrum we decided it would be better if the client (browser) did not need to specify the bidx and rescale query string parameters. Ideally there would be:

  1. a visualization COG that is compatible with the PNG driver for web usage (converted to 16-bit unsigned integer). The visualization COG should also have a nice color ramp baked in.
  2. a data COG that keeps the original data in Float64 format

cc @geohacker @bitner @pieschker

pieschker commented 4 years ago

Wouldn't it be a COP, not a COG at that point?

guidorice commented 4 years ago

Cloud optimized PNG? Yeah, we could register copeo.org? 🤔
Please elaborate if you have concerns or a different action plan. Thanks. cc @pieschker @bitner @geohacker

guidorice commented 4 years ago

"The visualization COG also should have a nice color ramp baked in."

Although, color ramp != color map. titiler allows color maps to be defined in its config: https://github.com/developmentseed/titiler/tree/master/docs#color-maps

I need to go through each of the raster layers and decide whether it holds classification values (color map) or continuous values (color ramp). If someone already knows, please add a comment here.

cc @bitner @geohacker @pieschker

guidorice commented 4 years ago

I'm getting a bit confused about terminology too, because of color_formula vs color_map (https://github.com/developmentseed/titiler/blob/master/docs/COG.md). The cfastie color_map was also mentioned on Slack. 🤔

guidorice commented 4 years ago

I filled in the raster layer details above. (source https://github.com/worldbank/HNP/wiki/Data-Extraction-for-Dashboard-creation#tiff-package)

Current plan

We will do the coloring in titiler because it supports easy ways to define a bitmap color table (for LandCover), and also supports cfastie and other common color formulas, which will be easy to swap out. (We probably want to use a diverging color scheme on the continuous-value layers.)

I am modifying load_tifs.sh to translate the float64 layers into UInt16 COGs, which will have less friction with the titiler stack.
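
The contents of load_tifs.sh are not shown in this thread, so the following is only a sketch of one way such a translation could be expressed. It builds a gdal_translate command using the standard -ot (output type), -scale (source and destination ranges) and -of (output format) options. The COG output driver assumes GDAL 3.1 or newer, and the helper name, file names and value range are hypothetical.

```python
import shlex


def uint16_cog_command(src, dst, src_min, src_max):
    """Build a gdal_translate command that rescales a raster to UInt16."""
    return [
        "gdal_translate",
        "-ot", "UInt16",  # output data type
        # Map [src_min, src_max] linearly onto the full UInt16 range.
        "-scale", str(src_min), str(src_max), "0", "65535",
        "-of", "COG",  # assumes GDAL >= 3.1, which ships the COG driver
        src,
        dst,
    ]


# Hypothetical file names and value range; print the command instead of running it.
cmd = uint16_cog_command("layer_float64.tif", "layer_uint16.tif", 0, 1000)
print(shlex.join(cmd))
```

Note that scaling changes the stored pixel values, which is one reason to keep a separate data COG with the original Float64 values.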

cc @geohacker @bitner @pieschker

guidorice commented 4 years ago

It looks like the 3 raster layers with continuous values (wp_2020_1km, WP_2020_1km_urban_pop, WP2020_vulnerability_map) are not normalized to the range [-1.0, 1.0]. Rather, they are zero-based and max out at some arbitrary float value. My interpretation is that they are not suitable for a diverging color scheme such as cfastie. For this iteration, I will try one of the other built-in titiler color schemes, like viridis or magma.

processing.run("qgis:rasterlayerstatistics", { 'INPUT': '/Users/alex/repos/covid-wb/s3-data/home/wb411133/data/Projects/CoVID/KEN/WP2020_vulnerability_map.tif', 'BAND': 1})
{'MAX': 1370.6923828125, 'MEAN': 0.9884854401589904, 'MIN': 0.0, 'RANGE': 1370.6923828125, 'STD_DEV': 5.82703651077801, 'SUM': 661674.3609045054, 'SUM_OF_SQUARES': 22728399.768185552}
processing.run("qgis:rasterlayerstatistics", { 'INPUT': '/Users/alex/repos/covid-wb/s3-data/home/wb411133/data/Projects/CoVID/KEN/WP_2020_1km.tif', 'BAND': 1})
{'MAX': 143742.84375, 'MEAN': 82.09278176709948, 'MIN': 0.0, 'RANGE': 143742.84375, 'STD_DEV': 562.3922669420514, 'SUM': 54951430.444824584, 'SUM_OF_SQUARES': 211715211030.541}
processing.run("qgis:rasterlayerstatistics", { 'INPUT': '/Users/alex/repos/covid-wb/s3-data/home/wb411133/data/Projects/CoVID/KEN/WP_2020_1km_urban_pop.tif', 'BAND': 1})
{'MAX': 143742.84375, 'MEAN': 9.452564410674107, 'MIN': -0.0, 'RANGE': 143742.84375, 'STD_DEV': 420.89467855908333, 'SUM': 10146732.383300781, 'SUM_OF_SQUARES': 190161688977.4985}
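
The reasoning above can be checked mechanically against the MIN and MAX values reported by QGIS. The sketch below uses a simple rule of thumb (a diverging scheme only makes sense when the data spans both sides of zero) and also tests whether the raw values fit in UInt16 without scaling. The helper names and the rule itself are assumptions for illustration, not part of titiler.

```python
# MIN/MAX copied from the QGIS raster layer statistics above.
LAYER_STATS = {
    "WP2020_vulnerability_map": {"MIN": 0.0, "MAX": 1370.6923828125},
    "WP_2020_1km": {"MIN": 0.0, "MAX": 143742.84375},
    "WP_2020_1km_urban_pop": {"MIN": -0.0, "MAX": 143742.84375},
}

UINT16_MAX = 65535


def suggest_scheme(stats):
    """Suggest 'diverging' only when values fall on both sides of zero."""
    return "diverging" if stats["MIN"] < 0 < stats["MAX"] else "sequential"


def fits_uint16(stats):
    """Return True when the raw value range fits in UInt16 without scaling."""
    return 0 <= stats["MIN"] and stats["MAX"] <= UINT16_MAX


for name, stats in LAYER_STATS.items():
    print(name, suggest_scheme(stats), fits_uint16(stats))
```

All three layers come out as sequential, consistent with trying viridis or magma. The two population layers have maxima above 65535, so a conversion to UInt16 would need to scale or clip their values.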

cc @geohacker @pieschker @bitner

guidorice commented 4 years ago

cc @pieschker @geohacker @bitner

guidorice commented 4 years ago

Raster update

> aws s3 ls s3://covid-wb/merged_tifs --recursive
2020-07-13 12:51:43          0 merged_tifs/2020-06-24/
2020-07-13 12:54:50  578302517 merged_tifs/2020-06-24/lc.tif
2020-07-13 12:54:54  983182331 merged_tifs/2020-06-24/wp2020_vulnerability_map.tif
2020-07-13 12:55:00 1236963880 merged_tifs/2020-06-24/wp_2020_1km.tif
2020-07-13 12:54:59   64597113 merged_tifs/2020-06-24/wp_2020_1km_urban_pop.tif
2020-07-13 12:56:31   72242665 merged_tifs/lc.tif
2020-07-13 16:01:39   46283379 merged_tifs/wp2020_vulnerability_map.tif
2020-07-13 16:00:52  968306219 merged_tifs/wp2020_vulnerability_map_data.tif
2020-07-13 16:03:12  181040095 merged_tifs/wp_2020_1km.tif
2020-07-13 16:02:30 1239914347 merged_tifs/wp_2020_1km_data.tif
2020-07-13 16:04:09   21196695 merged_tifs/wp_2020_1km_urban_pop.tif
2020-07-13 16:03:45   64311161 merged_tifs/wp_2020_1km_urban_pop_data.tif

@bitner note that the file sizes are now a lot smaller. I moved your files into the 2020-06-24 folder. The new *_data.tif files are the float64 TIFFs. The other *.tif files are the uint16 TIFFs, except lc.tif, which is unchanged in format and remains as byte values.

Also, just noting that I think your files may have been inflated because, for me, the find was matching many copies of the country TIFFs. I cleaned up my local copy of the S3 bucket so there should be only one copy of each country TIFF (hence the smaller sizes even for the float64 files).

cc @pieschker @geohacker @bitner