Closed by pjhartzell 1 year ago
I'm implementing option 1 from above. I'll add a `from_numpy_array` alternative constructor as an assist for those cases where you want to build a footprint from an existing data array.
Performance note: for a single-band ESA WorldCover raster, which is quite large at 36000x36000 pixels, reading the mask with rasterio takes about 1.3 seconds. The existing numpy-based mask generator takes about 45 seconds.
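To make option 1 concrete, here is a rough sketch of how a mask-first constructor and a `from_numpy_array` alternative constructor could fit together. The class name `Footprint`, the free function `create_mask`, and the 0/255 mask convention (borrowed from what rasterio's mask reads return) are assumptions for illustration, not the actual implementation.

```python
import numpy as np


def create_mask(data_array, no_data=None):
    """Hypothetical free function: build a 2D uint8 mask from a data array.

    A pixel is 255 (valid) if at least one band holds data, otherwise 0.
    """
    data_array = np.asarray(data_array)
    if data_array.ndim == 2:
        # Treat a single 2D band as a one-band stack
        data_array = data_array[np.newaxis, ...]
    if no_data is None:
        # Without a nodata value every pixel counts as valid
        valid = np.ones(data_array.shape[1:], dtype=bool)
    elif np.isnan(no_data):
        # NaN never compares equal to itself, so test it explicitly
        valid = ~np.isnan(data_array).all(axis=0)
    else:
        valid = (data_array != no_data).any(axis=0)
    return valid.astype(np.uint8) * 255


class Footprint:
    """Hypothetical stand-in for the footprint class (option 1)."""

    def __init__(self, mask_array):
        # The primary constructor takes a ready-made mask
        self.mask_array = mask_array

    @classmethod
    def from_numpy_array(cls, data_array, no_data=None):
        # Convenience path for callers that only have raw data
        return cls(create_mask(data_array, no_data))
```

Callers that already have a mask (for example, one read with rasterio) would pass it straight to the constructor, while callers holding raw data would go through `Footprint.from_numpy_array(data, no_data=0)`.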
I've found that using rasterio masks is much faster (an order of magnitude plus) than the `data_mask` class method. I think we should default to using rasterio's `read_masks` method.

A few possible ways to make this change:

1. The `data_array` argument would be replaced with something like `mask_array`. This is clean, but requires users to build their own mask array if they are directly using the class constructor. We can retain the existing mask creation logic as a free function utility for this purpose, but we are asking users to do more work in certain cases.
2. Add an `is_mask` flag to instruct the class to bypass mask creation (any `no_data` values would be ignored) and use the `data_array` argument as the mask.

I'm open to other ideas.
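The `is_mask` flag idea could look roughly like the sketch below. The class name `Footprint`, the helper `_data_mask`, and the 0/255 mask convention are assumptions for illustration. NaN nodata values are not handled in this simplified version.

```python
import numpy as np


class Footprint:
    """Hypothetical stand-in for the footprint class (is_mask flag variant)."""

    def __init__(self, data_array, no_data=None, is_mask=False):
        if is_mask:
            # Bypass mask creation: the array already is the mask,
            # so any no_data value is ignored.
            self.mask_array = np.asarray(data_array, dtype=np.uint8)
        else:
            self.mask_array = self._data_mask(np.asarray(data_array), no_data)

    @staticmethod
    def _data_mask(data_array, no_data):
        """Build a 2D uint8 mask (255 = valid, 0 = nodata) from raw data."""
        if no_data is None:
            # Without a nodata value every pixel counts as valid
            valid = np.ones(data_array.shape[-2:], dtype=bool)
        else:
            valid = data_array != no_data
            if valid.ndim == 3:
                # Multi-band input: valid if any band holds data
                valid = valid.any(axis=0)
        return valid.astype(np.uint8) * 255
```

The constructor signature stays the same for existing callers, at the cost of one argument (`data_array`) meaning two different things depending on the flag.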