**Open** · wwoodgate opened this issue 3 weeks ago
Hey @wwoodgate! Yep, it would be really neat to provide easier access to that data - it's pretty incredible stuff! Unfortunately, a challenge is that the data doesn't currently have STAC metadata or a STAC search API, which means the approach used in the Planetary Computer notebook (using Planetary Computer's STAC API to find and load the data for an AOI) doesn't directly apply.
It would be possible to develop a funky method that uses Meta's custom index file as a lookup - e.g. work out what file paths to load for an AOI. It would be a bit more bespoke, but probably pretty useful given how enormous this data is! Will have a think - and please feel free to share anything you put together yourself (we could try and get it into a notebook here)!
@wwoodgate Here's a possible approach - use geopandas to load the index tile file for a bounding box, then load the corresponding tile and clip to your AOI using odc.geo:
```python
import os

import geopandas as gpd
import odc.geo.xr
import rioxarray
from odc.geo.geom import BoundingBox

# Allow anonymous access to Meta canopy height data on Amazon S3
os.environ["AWS_NO_SIGN_REQUEST"] = "YES"

# Set study area
bbox = BoundingBox.from_xy(
    x=(152.436658, 152.453911),
    y=(-32.387423, -32.402873),
    crs="EPSG:4326",
)

# Read index tiles within bbox
tiles_gdf = gpd.read_file(
    "s3://dataforgood-fb-data/forests/v1/alsgedi_global_v6_float/tiles.geojson",
    bbox=bbox,
)

# Load the first matching tile
tile_id = tiles_gdf.tile.iloc[0]
ds = rioxarray.open_rasterio(
    f"s3://dataforgood-fb-data/forests/v1/alsgedi_global_v6_float/chm/{tile_id}.tif"
).squeeze()

# Crop to extent of bbox
ds_cropped = ds.odc.crop(bbox.polygon)

# Set zero values to NaN for visualisation
ds_cropped = ds_cropped.where(ds_cropped > 0)

# Export to file
ds_cropped.odc.write_cog(f"meta_chm_{tile_id}.tif")

# Explore on map
ds_cropped.odc.explore(cmap="Greens", vmin=0, vmax=15)
```
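One caveat: the snippet above loads only the first matching tile, but an AOI that crosses a tile boundary will match several entries in `tiles.geojson`. A rough sketch for collecting all intersecting tile IDs - the `tiles_for_bbox` helper is my own, and the footprints below are synthetic stand-ins for the real index:

```python
import geopandas as gpd
from shapely.geometry import box


def tiles_for_bbox(tiles_gdf: gpd.GeoDataFrame, bounds: tuple) -> list:
    """Return tile IDs whose footprints intersect (minx, miny, maxx, maxy) bounds."""
    aoi = box(*bounds)
    hits = tiles_gdf[tiles_gdf.geometry.intersects(aoi)]
    return sorted(hits.tile.tolist())


# Synthetic two-tile index standing in for tiles.geojson
demo = gpd.GeoDataFrame(
    {"tile": ["a", "b"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)
print(tiles_for_bbox(demo, (0.5, 0.2, 0.8, 0.8)))  # → ['a']
print(tiles_for_bbox(demo, (0.5, 0.2, 1.5, 0.8)))  # → ['a', 'b']
```

Each returned tile could then be opened with `rioxarray.open_rasterio` as above and mosaicked (e.g. with `rioxarray.merge.merge_arrays`) before cropping.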
Looks great! Thanks very much @robbibt
One minor hiccup: an error was thrown when reading the GeoJSON with geopandas - "CPLE_AWSAccessDeniedError: Access Denied" - so perhaps something to do with the S3 mount?
I changed the path to "https://dataforgood-fb-data.s3.amazonaws.com/forests/v1/alsgedi_global_v6_float/tiles.geojson" and it worked nicely.
Thanks a bunch for this!!
Setting this should, in theory, have given you access:
os.environ["AWS_NO_SIGN_REQUEST"] = "YES"
But yep, access via HTTPS should work as a workaround.
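For anyone else hitting the same error, the HTTPS fallback is just a string rewrite of the `s3://` path. A tiny helper to do the conversion - the function name is my own, and it assumes the standard public `<bucket>.s3.amazonaws.com` endpoint:

```python
def s3_to_https(s3_url: str) -> str:
    """Convert an s3://bucket/key URL to its public HTTPS equivalent."""
    assert s3_url.startswith("s3://"), "expected an s3:// URL"
    bucket, _, key = s3_url[len("s3://"):].partition("/")
    return f"https://{bucket}.s3.amazonaws.com/{key}"


print(s3_to_https(
    "s3://dataforgood-fb-data/forests/v1/alsgedi_global_v6_float/tiles.geojson"
))
# → https://dataforgood-fb-data.s3.amazonaws.com/forests/v1/alsgedi_global_v6_float/tiles.geojson
```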
I was messing around with a notebook example for this (comparing cover from DEA Fractional Cover with height from CHM), so will post it here when ready!
Yup, adding that fixes the issue.
Looking forward to the cover vs height comparison!
Hi DEA team,
Building on the fantastic MS Planetary Computer notebook functionality that enables users to gain access to the MS Data Catalogue, I was wondering if similar functionality could be added to load in Open (spatial) Data from AWS? I'm specifically thinking of the recent 1m High Resolution (global) Canopy Height Maps by WRI and Meta (https://registry.opendata.aws/dataforgood-fb-forests/).
I had a poke around the files but couldn't see an easy way to do a spatial query (on, say, a specified bbox).
Just a suggestion!
Cheers