NRCan / geo-deep-learning

Deep learning applied to georeferenced datasets
https://geo-deep-learning.readthedocs.io/en/latest/
MIT License

FEATURE: read only valid portion of imagery #538

Open remtav opened 1 year ago

remtav commented 1 year ago

Is your feature request related to a problem? Please describe. Problem: some of the mosaics we use consist mostly of nodata regions. For example, NB_Edmunstron_2016-B: [image]

This 15 cm image covers approx. 24 x 19.5 km, i.e. 468 km2 including nodata (5200 km2 in 50 cm equivalent). The file for each individual band is over 28 GB. However, the valid portion covers only approx. 10-15% of the entire image.
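To make the scale concrete, the figures above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes that "50 cm equivalent" means rescaling the area by the squared ratio of the pixel sizes (which does give 5200 km2), and the variable names are made up for illustration.

```python
# Figures quoted in this issue
width_km, height_km = 24, 19.5
gsd_cm = 15                      # pixel size of the mosaic
valid_percent_range = (10, 15)   # approximate share of valid pixels

area_km2 = width_km * height_km  # footprint including nodata

# Assumption: "50 cm equivalent" scales the area by (50 / 15) ** 2
equiv_50cm_km2 = area_km2 * (50 / gsd_cm) ** 2

# Raster dimensions in pixels (1 km = 100 000 cm)
cols = round(width_km * 100_000 / gsd_cm)
rows = round(height_km * 100_000 / gsd_cm)
total_px = cols * rows

# Pixels that actually carry data, for the low and high estimates
valid_px = [total_px * p // 100 for p in valid_percent_range]

print(f"footprint: {area_km2} km2, 50 cm equivalent: {equiv_50cm_km2:.0f} km2")
print(f"raster: {cols} x {rows} = {total_px:,} px per band")
print(f"valid pixels: {valid_px[0]:,} to {valid_px[1]:,}")
```

In other words, roughly 85-90% of the pixels read from each band would be nodata.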

How can we optimize this?

@mpelchat04 had brought up this issue already... The only solution we could come up with was creating smaller mosaics.

Describe the solution you'd like. How can nodata be supported along the way, in the dataloader for example? Anywhere else? Any other ideas?
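One possible direction, sketched below, is to decide which tiles to read from a coarse validity mask before touching the full-resolution data. This is not something geo-deep-learning currently does, as far as this issue states; `valid_tile_windows` and its parameters are hypothetical names, and the mask is assumed to be a small 2D grid of booleans (True = valid), for example one derived from a low-resolution overview of the mosaic.

```python
from typing import Iterator, List, Tuple

# Hypothetical window layout: (row_off, col_off, height, width)
Window = Tuple[int, int, int, int]


def valid_tile_windows(
    mask: List[List[bool]],
    tile_size: int,
    min_valid_fraction: float = 0.0,
) -> Iterator[Window]:
    """Yield tile windows whose share of valid pixels exceeds the threshold."""
    n_rows = len(mask)
    n_cols = len(mask[0]) if n_rows else 0
    for row_off in range(0, n_rows, tile_size):
        # Edge tiles may be smaller than tile_size
        height = min(tile_size, n_rows - row_off)
        for col_off in range(0, n_cols, tile_size):
            width = min(tile_size, n_cols - col_off)
            valid = sum(
                mask[r][c]
                for r in range(row_off, row_off + height)
                for c in range(col_off, col_off + width)
            )
            if valid / (height * width) > min_valid_fraction:
                yield (row_off, col_off, height, width)


# Toy 4 x 6 mask: only the top-left 2 x 3 corner holds valid data
mask = [
    [True, True, True, False, False, False],
    [True, True, True, False, False, False],
    [False] * 6,
    [False] * 6,
]

any_valid = list(valid_tile_windows(mask, tile_size=2))
mostly_valid = list(valid_tile_windows(mask, tile_size=2, min_valid_fraction=0.5))
print(any_valid)     # [(0, 0, 2, 2), (0, 2, 2, 2)]
print(mostly_valid)  # [(0, 0, 2, 2)]
```

With the default threshold, tiles that are entirely nodata are skipped (four of the six tiles in this toy mask); raising `min_valid_fraction` also drops tiles that are only partly covered. In a real pipeline the windows would have to be scaled from mask coordinates to full-resolution pixel coordinates before reading.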

remtav commented 10 months ago

Update: this remains a major optimization issue for our extraction pipeline :)

Other examples where valid data is less than 75%: [8 images]