Adding On-demand Training Data Notebook

KennSmithDS commented 2 years ago

PR to merge the notebook tutorial for creating on-demand training data from the Planetary Computer data catalog when starting from a Radiant MLHub dataset.

KennSmithDS commented 2 years ago

@TomAugspurger do you know why the stackstac.stack function adds a buffer to the ndarrays it returns? Does it have to do with how it reprojects the image data when it's cached?

For example, in this block of code I'm fetching the Sentinel-2 source imagery from the Azure Blob Storage for our MLHub. We know the chips are all 120x120 pixels, but the dimensions of the stacked array vary from 122x122 up to 130x130.

    s2_stack = stack(
        items=ItemCollection([source_item]),
        assets=BIGEARTHNET_RGB_BANDS,
        epsg=rio.open(get_redirect_url(source_item.assets["B02"])).crs.to_epsg(),
        resolution=10,
    )
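
One possible explanation, which is an assumption on my part and not confirmed in this thread: when no explicit bounds are passed, the output extent is derived from the item's bounding box, and that extent is snapped outward to whole multiples of the resolution. If the reprojected bounding box is slightly larger than the chip or not aligned to the 10 m grid, the array picks up extra pixels. The sketch below only reproduces that arithmetic in plain Python. The helper `snapped_shape` and the coordinates are hypothetical, not stackstac code.

```python
import math


def snapped_shape(bounds, resolution):
    """Return (height, width) in pixels after snapping bounds outward to the grid.

    Hypothetical helper: it mimics snapping an extent to whole-number
    multiples of `resolution`; it is not taken from stackstac itself.
    """
    minx, miny, maxx, maxy = bounds
    # Lower edges snap down, upper edges snap up, so the extent can only grow.
    minx = math.floor(minx / resolution) * resolution
    miny = math.floor(miny / resolution) * resolution
    maxx = math.ceil(maxx / resolution) * resolution
    maxy = math.ceil(maxy / resolution) * resolution
    width = round((maxx - minx) / resolution)
    height = round((maxy - miny) / resolution)
    return height, width


# Made-up chip footprint: exactly 1200 m x 1200 m and aligned to the 10 m grid.
aligned = (600000.0, 5000000.0, 601200.0, 5001200.0)

# The same footprint padded by 13 m per side, standing in for the slightly
# larger axis-aligned box that a reprojected lon/lat bbox could produce.
padded = (599987.0, 4999987.0, 601213.0, 5001213.0)

print(snapped_shape(aligned, 10))
print(snapped_shape(padded, 10))
```

If this is indeed the cause, passing the chip's exact extent in the target CRS through the `bounds` argument of `stack` might give 120x120 arrays, though I have not verified that here.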

P.S. sorry I don't know how you're doing the cool Jupyter Notebook integration.

KennSmithDS commented 2 years ago

Closing in favor of #171 due to a rebasing issue.

microsoft / PlanetaryComputerExamples

Adding On-demand Training Data Notebook #162