Open abarciauskas-bgse opened 1 year ago
@abarciauskas-bgse 👋
You could use rio-tiler Mosaic
feature to fetch
all image and then merge them into one image
and then get the statistics from it 👇
import numpy
from rio_tiler.io import Reader
from rio_tiler.mosaic import mosaic_reader
from rio_tiler.models import ImageData
mosaic_url = match.groups()[0]
str_bounds = f"{bounds[0]},{bounds[1]},{bounds[2]},{bounds[3]}"
assets_endpoint = f"{titiler_url}/mosaicjson/{str_bounds}/assets?url={mosaic_url}/mosaicjson"
# create a dataset from multiple COGs?
assets_response = requests.get(assets_endpoint)
datasets = []
assets = assets_response.json()
# see https://github.com/cogeotiff/rio-tiler/blob/main/rio_tiler/io/rasterio.py#L368-L380
def _part_read(src_path: str, *args, **kwargs) -> ImageData:
with Reader(src_path) as src:
return src.part(bbox, *args, **kwargs)
# mosaic_reader will use multithreading to distribute the image fetching
# and them merge all arrays together
img, _ = mosaic_reader(assets, bounds, max_size=512). # change the max_size to make it faster/slower
data = img.as_masked() # create Masked Array from ImageData
# Avoid non masked nan/inf values
numpy.ma.fix_invalid(data, copy=False)
hist = {}
for ii, b in enumerate(img.count):
h_counts, h_keys = numpy.histogram(data[b].compressed())
hist[f"b{ii + 1}"] = [h_counts.tolist(), h_keys.tolist()]
☝️ This will not work if the image do not have the same resolution (because we won't be able to overlay them). if you know the resolution you want to use you can use width=.., height=..
instead of max_size=512
(it will ensure you create the same array size for all the images.
I'm also assuming that the assets
are COG and not STAC Items (if not then you have to use STACReader and pass a list of stac assets
to read.
@wildintellect mentioned that titiler supports returning numpy arrays and we were wondering if it would be possible to call the mosaics endpoint with the bounding box and get back a numpy array.
Yes, TiTiler can return numpy array https://developmentseed.org/titiler/examples/notebooks/Working_with_NumpyTile/
You can use the /crop
endpoint for each file https://titiler.maap-project.org/docs#/SpatioTemporal%20Asset%20Catalog/part_stac_crop__minx___miny___maxx___maxy___format__get
for asset in assets_response.json():
try:
crop_endpoint = f"{titiler_url}/cog/crop/{str_bounds}.npy?url={asset}&max_size=512" # Same here you can either use max_size or width&height
res = requests.get(crop_endpoint)
arr = numpy.load(BytesIO(r.content))
tile, mask = arr[0:-1], arr[-1]
# TODO:
# convert tile/mask to xarray dataset
datasets.happen(ds)
except:
pass
ds = xr.combine_by_coords(datasets, fill_value=datasets[0]._FillValue)
Thanks @vincentsarago will take a closer look at this solution.
To create a histogram from a mosaic, I am using the titiler mosaic assets endpoint to discover which assets fall into the current bounding box and then use those to clip and combine datasets: https://github.com/developmentseed/stac_ipyleaflet/blob/main/stac_ipyleaflet/core.py#L127
Is there a better way to do this? @wildintellect mentioned that titiler supports returning numpy arrays and we were wondering if it would be possible to call the mosaics endpoint with the bounding box and get back a numpy array. This looks possible via https://titiler.maap-project.org/docs#/SpatioTemporal%20Asset%20Catalog/tile_stac_tiles__TileMatrixSetId___z___x___y___format__get using
npy
as the format, but what should XYZ be? It doesn't look there is anyway to request the data just for a bounding box either. If I had to guess, I would imagine that we need first determine which (numpy) tiles to request based on the bounding box (giving us the x,y) and then clip those tiles to the bounding box. And then the zoom may depend on what level of accuracy we require for the histogram.@vincentsarago any suggestions?