stac-utils / stac-sentinel

STAC Collections and Items for Sentinel data
MIT License

Add sentinel-1 RTC STAC metadata generator #1

Closed scottyhq closed 3 years ago

scottyhq commented 3 years ago

@matthewhanson - I think this is ready to go, but I've only tested on a single acquisition. So I was thinking it might be good to generate a full static catalog with a couple acquisitions from a few grid squares.

this adds several dependencies (rasterio, geopandas, pystac), mainly because i think the best way to get metadata in this case is not reading a single .json file, but actually getting the metadata stored within .tif files.

i've added some documentation in the readme, would appreciate review at this point to figure out next steps!

scottyhq commented 3 years ago

@matthewhanson thoughts on getting this merged?

matthewhanson commented 3 years ago

Ah sorry @scottyhq, this fell off my radar along with most everything else when I took PTO at the end of the year.

All the metadata looks good, but my only concern is getting the metadata from the .tif files, because it's much slower to generate if the files are all remote, or if they are in a requester-pays bucket. So if we wanted to create STAC for the entire sentinel-1-rtc public dataset, that's a lot of files that will have to be hit.

It sounds like there are some fields that can't be retrieved from just the metadata files? What fields are those? Does it make sense to have it be an option to get extended metadata from the tif files, or are there some required fields in there?

scottyhq commented 3 years ago

> All the metadata looks good, but my only concern is getting the metadata from the .tif files, because it's much slower to generate if the files are all remote.

As far as I could tell, there is no single existing metadata file here that simply needs converting to STAC. The only way to get required fields (like the bbox) is to open the tif. Reading just the metadata takes ~200 ms from S3.
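For reference, a minimal sketch of what "open the tif to get the bbox" can look like with rasterio. This is not the code in this PR: the function names and the `href` argument are hypothetical, and it assumes the .tif is a georeferenced GeoTIFF readable by GDAL. Only the file header is read, not the pixel data.

```python
def tif_bbox_wgs84(href):
    """Return [west, south, east, north] in EPSG:4326 for a GeoTIFF.

    `href` is a hypothetical local path or remote URL (e.g. an s3:// URI).
    rasterio is imported lazily so the pure helper below works without it.
    """
    import rasterio
    from rasterio.warp import transform_bounds

    # Opening the dataset reads only the header; src.bounds is in the file's CRS.
    with rasterio.open(href) as src:
        return list(transform_bounds(src.crs, "EPSG:4326", *src.bounds))


def bbox_to_polygon(bbox):
    """Build a closed GeoJSON Polygon footprint from [west, south, east, north]."""
    west, south, east, north = bbox
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # repeat the first vertex to close the ring
    ]
    return {"type": "Polygon", "coordinates": [ring]}
```

A STAC Item needs both a `bbox` and a `geometry`, so the second helper turns the reprojected bounds into a rectangular footprint. A rectangle is an approximation here; the true valid-data footprint may be smaller.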

fortunately the bucket isn't requester pays... currently at least (https://registry.opendata.aws/sentinel-1-rtc-indigo/). it's just CONUS, but i don't see an inventory file. A back-of-the-envelope calculation suggests ~228,390 Items (933 grid squares, 230 images to date), growing at a rate of ~1000 Items per week (each with 3 .tif assets).
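To put the ~200 ms figure next to those counts, here is a rough estimate of the total header-read time. It takes the quoted Item total as given and assumes (these are assumptions, not measurements) that every one of the 3 .tif assets is opened once, that each open costs 200 ms, and that reads are sequential unless `workers` is raised.

```python
ITEMS = 228_390          # Item total quoted in this thread
TIFS_PER_ITEM = 3        # .tif assets per Item, from this thread
SECONDS_PER_READ = 0.2   # ~200 ms per header read, assumed to apply per file
WEEKLY_NEW_ITEMS = 1_000 # approximate weekly growth, from this thread


def header_read_hours(items, tifs_per_item=TIFS_PER_ITEM,
                      seconds_per_read=SECONDS_PER_READ, workers=1):
    """Hours spent reading headers, assuming perfectly parallel workers."""
    total_seconds = items * tifs_per_item * seconds_per_read / workers
    return total_seconds / 3600


backfill_hours = header_read_hours(ITEMS)
backfill_hours_16 = header_read_hours(ITEMS, workers=16)
weekly_minutes = header_read_hours(WEEKLY_NEW_ITEMS) * 60

print(f"backfill, sequential: {backfill_hours:.1f} h")
print(f"backfill, 16 workers: {backfill_hours_16:.1f} h")
print(f"weekly increment:     {weekly_minutes:.0f} min")
```

Under those assumptions the one-time backfill is roughly 38 hours sequentially and the weekly increment about 10 minutes, so the backfill is the only part where parallel reads would really matter.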

also could ping someone at indigo-ag but i don't see any activity in their github org related to the rtc dataset...

scottyhq commented 3 years ago

@matthewhanson thoughts on merging this and trying things out? would be really nice to have a public SAR STAC to test 1.0.0rc1

matthewhanson commented 3 years ago

@scottyhq I'm going ahead and merging this. However, now the preferred way of doing these sorts of mappings is to add a submodule to stactools. Would be good to eventually migrate the S1 stuff there.