developmentseed / public-datasets

[WIP] End to End solution to Create and Deploy a STAC API for public datasets
MIT License

Creating Landsat8 items #2

Open vincentsarago opened 3 years ago

vincentsarago commented 3 years ago

At a minimum, we have to create the scene geometry from the Landsat WRS geometry.
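To make the idea concrete, here is a rough sketch of building a bare-bones item from a WRS path/row lookup. Everything in it is an assumption for illustration: the `WRS_GEOMETRIES` table, the placeholder coordinates (not real WRS-2 footprints), the `create_item` helper, the scene id, and the choice of `stac_version`. It is not the code in the repo.

```python
from datetime import datetime, timezone

# Hypothetical lookup: (path, row) -> closed polygon ring of (lon, lat) pairs.
# Placeholder coordinates only; a real table would come from the WRS-2 grid.
WRS_GEOMETRIES = {
    (1, 1): [(-10.0, 80.0), (-2.0, 80.0), (-2.0, 82.0), (-10.0, 82.0), (-10.0, 80.0)],
}


def bbox_from_ring(ring):
    """Return [min_lon, min_lat, max_lon, max_lat] for a polygon ring."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return [min(lons), min(lats), max(lons), max(lats)]


def create_item(scene_id, path, row, acquired):
    """Build a minimal STAC-like item dict using the WRS geometry as footprint."""
    ring = WRS_GEOMETRIES[(path, row)]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",  # assumed version
        "id": scene_id,
        "geometry": {"type": "Polygon", "coordinates": [[list(pt) for pt in ring]]},
        "bbox": bbox_from_ring(ring),
        "properties": {"datetime": acquired.isoformat()},
        "links": [],
        "assets": {},
    }


item = create_item("EXAMPLE_SCENE", 1, 1, datetime(2020, 1, 1, tzinfo=timezone.utc))
```

The point is only that the path/row pair is enough to attach a geometry and bbox to each scene without opening any raster.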

Data

USGS: https://www.usgs.gov/core-science-systems/nli/landsat/bulk-metadata-service

Links

vincentsarago commented 3 years ago

sample: https://gist.github.com/vincentsarago/d5091ff44ea6a0a405c3e486c69ffbf4

code: https://github.com/developmentseed/public-datasets/blob/main/public_datasets/feeder/public_datasets/feeder/landsat/aws.py#L198-L302

@sharkinsspatial @geospatial-jeff could you give this a 👀 🙏

geospatial-jeff commented 3 years ago

LGTM. csv.DictReader might make parsing the CSV easier: https://docs.python.org/3/library/csv.html#csv.DictReader. I'd also recommend lowering the precision of the various float values throughout the item.
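Something along these lines is what the two suggestions could look like together. This is a sketch, not the feeder code: the column names in `SAMPLE`, the `iter_rows` and `round_floats` helpers, and the default of 5 decimal places are all made up for the example.

```python
import csv
import io


def iter_rows(fileobj):
    """Yield one dict per CSV line, keyed by the header row."""
    yield from csv.DictReader(fileobj)


def round_floats(value, ndigits=5):
    """Recursively round every float inside nested dicts and lists."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {k: round_floats(v, ndigits) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [round_floats(v, ndigits) for v in value]
    return value


# Hypothetical metadata with invented column names.
SAMPLE = "sceneId,cloudCover,centerLat\nEXAMPLE,12.3456789,45.123456789\n"

rows = list(iter_rows(io.StringIO(SAMPLE)))
properties = round_floats(
    {
        "cloud_cover": float(rows[0]["cloudCover"]),
        "center": [float(rows[0]["centerLat"])],
    }
)
```

DictReader returns every value as a string, so the float conversion still has to happen explicitly before rounding.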

kylebarron commented 3 years ago

I'd personally use Pandas in this situation, and I think it would be faster to use the Pandas chunked CSV reader, but maybe that's too large of a dependency? It might be more relevant for Sentinel because there's more to do

vincentsarago commented 3 years ago

Yeah, if we can avoid any fancy dependency, that would be better. We don't really need pandas for the Landsat item creation because we only need to process one line at a time and don't need to operate on the full dataset... which might be the case for Sentinel!

Edit/Note: right now we have a rasterio dependency, but it could easily go!

kylebarron commented 3 years ago

Given that creating items is mostly a one-time pipeline, I don't have any aversion to fancy dependencies if they make our lives easier.