NASA-IMPACT / veda-data-pipelines

data transformation - ingestion - publication pipelines to support VEDA

Fix bounding in EPA COG assets #196

Closed anayeaye closed 1 year ago

anayeaye commented 2 years ago

What

EPA assets cannot be zoomed > 0 in the dashboard. The fix is to recreate the COG assets with appropriate overviews.

blurry-epa

EPA COG info snippet

rio cogeo info /vsis3/veda-data-store-staging/EIS/cog/EPA-inventory-2012/monthly/EPA-monthly-emissions_1B2b_Natural_Gas_Production_201208.tif

<SNIP>
Image Structure
    COMPRESSION: LZW
    INTERLEAVE: BAND
    LAYOUT: COG

IFD
    Id      Size           BlockSize     Decimation           
    0       700x350        512x512       0
    1       350x175        512x512       2

LIS TWS COG info snippet for comparison w/overviews

% rio cogeo info /vsis3/veda-data-store-staging/EIS/COG/LIS_TWS_ANOMALY/Anomaly_TWS_20020923.cog.tif 
<SNIP>

Image Metadata
    AREA_OR_POINT: Area
    OVR_RESAMPLING_ALG: NEAREST

Image Structure
    COMPRESSION: DEFLATE
    INTERLEAVE: BAND
    LAYOUT: COG

IFD
    Id      Size           BlockSize     Decimation           
    0       3600x1500      512x512       0
    1       1800x750       128x128       2
    2       900x375        128x128       4

AC

xhagrg commented 2 years ago

@anayeaye if we reprocess the images and upload them to the bucket with the same filenames, do we even need to re-ingest into STAC? Wondering whether we need to remove and update the STAC records at all.

anayeaye commented 2 years ago

@xhagrg True, as long as we're confident that replacing the COGs in place works without updating the STAC records. If it is an easy/safe operation, we wouldn't need to update STAC or delta-config. Maybe add a thorough dashboard QA step to the AC instead?

xhagrg commented 2 years ago

Yes. When I get the reprocessing done, I will post in the veda-dashboard channel and ask for review/tests in the dashboard.

xhagrg commented 2 years ago
image (6)

The COGs look okay in QGIS.

xhagrg commented 2 years ago

@anayeaye I updated the overviews and uploaded the files; still the same behavior.

image
xhagrg commented 2 years ago

It might be because of the min/max zoom levels.

anayeaye commented 2 years ago

At zooms > 0 for EPA emissions datasets such as Forest Fires (daily), we are getting 'Method returned empty array' errors from the tiler. The COGs are very small, and it initially seemed like adding overviews and/or increasing the file size/decreasing pixel resolution could help, but neither did. For example, the Abandoned Coal Mines COG was updated (see notes in this issue), but even with additional overviews and massive resizing we are not able to get tiles for higher zooms.

This seems like either a COG metadata problem or possibly some sort of caching in the tiler--or maybe even our map layer parameters? @vincentsarago do you see what might be going on here? Maybe we can tag up or look at the issue async sometime next week.

Example request https://staging-raster.delta-backend.com/mosaic/tiles/b61930a098d093f160ca0c5e4d28f7c6/WebMercatorQuad/5/7/12@1x?assets=cog_default&colormap_name=rainbow&rescale=0,1690773099642&nodata=0.

STAC Collection https://staging-stac.delta-backend.com/collections/EPA-daily-emissions_5_Forest_Fires

Sample unmodified COG copied to public bucket s3://covid-eo-data/20220923-epa-sample/EPA-daily-emissions_5_Forest_Fires_20120101.tif

Sample reprocessed COG copied to public bucket s3://covid-eo-data/20220923-epa-sample/EPA-annual-emissions_1B1a_Abandoned_Coal.tif

vincentsarago commented 2 years ago
$ rio cogeo info EPA-daily-emissions_5_Forest_Fires_20120101.tif
...
Geo
    Crs:              EPSG:4326
    Origin:           (-129.99999694660497, 19.999999240339655)
    Resolution:       (0.09999999672558174, 0.0999999934417812)
    BoundingBox:      (-129.99999694660497, 54.99999694496308, -59.999999238697754, 19.999999240339655)
    MinZoom:          3
    MaxZoom:          4
...
BoundingBox:      (-129.99999694660497, 54.99999694496308, -59.999999238697754, 19.999999240339655)

The BoundingBox is really weird: read as (left, bottom, right, top), it should be (-129.99999694660497, 19.999999240339655, -59.999999238697754, 54.99999694496308).

The issue, then, is that rio-tiler will always return TileOutsideBounds (and thus an empty array) for this file because the geographic bounds are wrong!
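To make the failure concrete, here is a minimal sketch of why swapped bottom/top values defeat a bounds check. It is not rio-tiler's actual code: `tile_bounds`, `overlaps`, and `normalize_bounds` are hypothetical helpers written for illustration, and the overlap test is a simplified version of the kind of check a tiler might apply. The tile is the 5/7/12 from the example request above.

```python
import math


def tile_bounds(x, y, z):
    """Return (west, south, east, north) in degrees for a WebMercatorQuad XYZ tile."""
    n = 2 ** z

    def lon(col):
        return col / n * 360.0 - 180.0

    def lat(row):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    return (lon(x), lat(y + 1), lon(x + 1), lat(y))


def overlaps(tile, dataset):
    """Simplified overlap test; both tuples are read as (left, bottom, right, top)."""
    t_left, t_bottom, t_right, t_top = tile
    d_left, d_bottom, d_right, d_top = dataset
    return (
        t_left < d_right
        and t_right > d_left
        and t_bottom < d_top
        and t_top > d_bottom
    )


def normalize_bounds(bounds):
    """Reorder a bounds tuple so that left <= right and bottom <= top."""
    x0, y0, x1, y1 = bounds
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


# BoundingBox exactly as reported by `rio cogeo info` for the Forest Fires COG.
reported = (-129.99999694660497, 54.99999694496308,
            -59.999999238697754, 19.999999240339655)

# z/x/y taken from the example request path .../WebMercatorQuad/5/7/12
tile = tile_bounds(7, 12, 5)

print(overlaps(tile, reported))                    # False
print(overlaps(tile, normalize_bounds(reported)))  # True
```

The tile covers roughly 32°N to 41°N over the central US, well inside the data extent, yet the check fails until bottom and top are put in order. Normalizing the tuple here only demonstrates the symptom; the files themselves still need to be rewritten with correct georeferencing.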

anayeaye commented 2 years ago

Thanks @vincentsarago! I think this is the original processing script. I think we now have access to the raw data, so it sounds like we just need to revisit the nc to COG processing code and check our bounds outputs until we get them right.

xhagrg commented 2 years ago

Will reprocess it and see if that works.

xhagrg commented 2 years ago

All of the EPA data is re-processed and currently loading fine.

Steps taken:

  1. Update bounds to be in proper format.
  2. Add CRS (when missing)
  3. Upscale the TIFF file 25x (using nearest neighbor) to get "higher" resolution.
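
For readers who want a picture of steps 1 and 3, here is a rough pure-Python sketch. It is not the actual reprocessing script. All function names are hypothetical, and it assumes the source grid was stored south-up (first row at the southern edge), which the reported origin at latitude ~20 suggests but the thread does not confirm. The thread also does not say whether "25x" is per axis, so `factor` is left as a parameter. Step 2 (assigning the CRS) needs a raster library and is left out.

```python
def north_up_geotransform(bounds, width, height):
    """Build a GDAL-style geotransform for a north-up raster.

    bounds is (left, bottom, right, top); the origin is the top-left corner
    and the pixel height is negative, so rows run from north to south.
    """
    left, bottom, right, top = bounds
    xres = (right - left) / width
    yres = (top - bottom) / height
    return (left, xres, 0.0, top, 0.0, -yres)


def flip_rows(grid):
    """Turn a south-up grid (first row = southern edge) into a north-up one."""
    return grid[::-1]


def upsample_nearest(grid, factor):
    """Nearest-neighbor upscaling: repeat every cell `factor` times per axis."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Tiny stand-in for a south-up grid read from the source file (assumption).
south_up = [[3, 4],
            [1, 2]]
north_up = flip_rows(south_up)
upscaled = upsample_nearest(north_up, 2)
print(upscaled)  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

# Rounded extent and the 700x350 size from the first `rio cogeo info` snippet.
print(north_up_geotransform((-130.0, 20.0, -60.0, 55.0), 700, 350))
```

Nearest-neighbor repetition adds pixels without adding information, which is why the result is only "higher" resolution in quotes: each original cell simply becomes a block of identical values.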

anayeaye commented 2 years ago

This issue is ready for QA: All of the COG assets for the EPA-* datasets in the EIS thematic area of the dashboard were reprocessed to solve a problem in which users could not zoom in on the map (the data became blurry at all zooms greater than 0). We need to make sure that all datasets are still viewable in the map and that users can zoom in without seeing a blurry map. cc: @gmarichalgxc @Catalina-Moller @mmaniceraGXC

Catalina-Moller commented 2 years ago

Only the EPA - Other - Forest Fire (Daily) Dataset is blurry when the user zooms in.

Evidence: imagen_2022-10-19_100319150

Url: https://www.earthdata.nasa.gov/dashboard/eis/datasets/epa-other/explore?position=-91.2387%7C36.5487%7C3.54&datetime=2012-01-01T00%3A00%3A00.000Z&layer=epa-daily-emissions_5_forest_fires

All other EPA-* datasets in the EIS thematic area of the dashboard are still visible on the map and users can zoom in without seeing a blurry map.

cc: @anayeaye

xhagrg commented 2 years ago

https://www.earthdata.nasa.gov/dashboard/eis/datasets/epa-other/explore?datetime=2012-01-01T00%3A00%3A00.000Z&layer=epa-daily-emissions_5_forest_fires&position=-92.3416%7C32.5267%7C4.22

This should now be fixed.

j08lue commented 2 years ago

They are not blurry anymore, but should the data really look like this - blobs with some kind of interpolation towards the center?

image

https://www.earthdata.nasa.gov/dashboard/eis/datasets/epa-other/explore?datetime=2012-01-01T00%3A00%3A00.000Z&layer=epa-daily-emissions_5_forest_fires&position=-96.0397%7C36.1956%7C8.93

Do we know who the data provider was? Then I can double-check with them.

xhagrg commented 2 years ago

The data was downloaded from https://www.epa.gov/ghgemissions/gridded-2012-methane-emissions#data. Original data is very coarse and we had to reprocess and upscale the images.

Catalina-Moller commented 2 years ago

The EPA datasets are rendering as expected, but we have found one possible low-priority issue: NASA-IMPACT/veda-data#68