microsoft / torchgeo

TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
https://www.osgeo.org/projects/torchgeo/
MIT License
Maxar imagery dataset #76

Open adamjstewart opened 2 years ago

adamjstewart commented 2 years ago

https://www.maxar.com/products/satellite-imagery

It seems like the imagery isn't free/open-source, but they do have samples we could use to write a data loader: https://resources.maxar.com/product-samples

calebrob6 commented 2 years ago

For anyone who works on this:

A good starting point is the System-Ready Stereo (1B), 8-band bundle, 50 cm | Rio de Janeiro, Brazil sample. The zip file download contains a "normal" Maxar scene with a directory structure as follows

I also think the dataset object should parse the XML files in the _PAN and _MUL subdirectories to get information about the scene (off-nadir angle, processing level, estimated cloud coverage, etc.).
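As a rough illustration of the XML parsing idea, here is a minimal sketch using only the standard library. It flattens every leaf element of an XML document into a dict, so the dataset could look up whichever fields it needs. The function name `leaf_values` and the sample tag names are hypothetical; I haven't checked them against the real Maxar schema, so the actual tags in the _PAN and _MUL XML files would need to be substituted.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def leaf_values(xml_text: str) -> Dict[str, str]:
    """Map each leaf element's tag to its stripped text (first occurrence wins)."""
    root = ET.fromstring(xml_text)
    values: Dict[str, str] = {}
    for elem in root.iter():
        # A leaf element has no children; skip leaves with empty text.
        if len(elem) == 0 and elem.text and elem.text.strip():
            values.setdefault(elem.tag, elem.text.strip())
    return values


# Hypothetical, hand-written sample; NOT copied from a real Maxar file.
SAMPLE = """
<SCENE>
  <IMAGE>
    <OFF_NADIR_ANGLE>12.5</OFF_NADIR_ANGLE>
    <CLOUD_COVER>0.03</CLOUD_COVER>
  </IMAGE>
  <PROCESSING_LEVEL>1B</PROCESSING_LEVEL>
</SCENE>
"""

metadata = leaf_values(SAMPLE)
print(metadata)
```

Keeping the first occurrence of a repeated tag is a simplification; if the real files repeat tags per band, the values would probably need to be collected into lists instead.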

Finally, the TIFs don't appear to be tiled by default, which will make windowed reading extremely slow. Users should be told to convert the TIFs to COGs before building a dataset from them (e.g., we could emit a warning if the Dataset is created from non-tiled TIFs).
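One possible shape for such a warning, as a sketch rather than a proposal for the final API: rasterio exposes each band's internal block shape via `block_shapes`, and a block that spans the full raster width suggests a strip layout rather than tiles. The helper names `looks_striped` and `warn_if_not_tiled` are hypothetical, and the heuristic is an assumption: it does not verify full COG compliance (overviews, header layout), and it will also flag small rasters that fit in a single tile.

```python
import warnings
from typing import Iterable, Tuple


def looks_striped(block_shapes: Iterable[Tuple[int, int]], width: int) -> bool:
    """Return True if any band's blocks span the full raster width.

    Block shapes are (rows, cols) tuples, as reported by rasterio.
    """
    return any(block_w >= width for _block_h, block_w in block_shapes)


def warn_if_not_tiled(path: str) -> bool:
    """Open a raster and warn if it appears to be stored in strips."""
    # Imported lazily so looks_striped() stays dependency-free.
    import rasterio

    with rasterio.open(path) as src:
        striped = looks_striped(src.block_shapes, src.width)
    if striped:
        warnings.warn(
            f"{path} does not appear to be internally tiled; windowed reads "
            "may be slow. Consider converting it to a COG first.",
            UserWarning,
        )
    return striped
```

For example, `looks_striped([(1, 10000)], 10000)` is `True` (one-row strips across the whole image), while `looks_striped([(512, 512)], 10000)` is `False`.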

adamjstewart commented 2 years ago

We may want to add a warning message for any raster file that isn't a COG; that should be easy to do. Is there a similar cloud-optimized file format for vector files, or are shapefiles the best we can do?