A possible problem is that, since most samples of input data will not be classified as 'burned', our model won't see enough burn samples to pick out the important features correctly. Instead, the model will probably tend to classify most samples as non-burned, as this gives the lowest loss, without learning to pick out burn features. (The relative abundance of burn is not important, since this is not a probabilistic model.)
Proportion of samples without any burn: 'no_burn_prop'
The rest of the samples will make up a proportion of 1 - 'no_burn_prop' and must meet the following criteria:
a) Burn proportion > 'bp' (0 -> 1)
b) Water proportion < 'wp' (0 -> 1)
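The criteria above can be written as a small acceptance check. This is only a sketch: the label values `BURN` and `WATER`, the flattened-mask input, and the helper names are assumptions for illustration, not the encoding of the actual MODIS product.

```python
from typing import Sequence

# Assumed label encoding, for illustration only.
BURN = 1
WATER = 2


def proportion(mask: Sequence[int], value: int) -> float:
    """Fraction of pixels in a flattened mask that equal `value`."""
    if len(mask) == 0:
        return 0.0
    return sum(1 for v in mask if v == value) / len(mask)


def is_no_burn(mask: Sequence[int]) -> bool:
    """True if the sample contains no burned pixels at all."""
    return proportion(mask, BURN) == 0.0


def is_burn_sample_ok(mask: Sequence[int], bp: float, wp: float) -> bool:
    """True if burn proportion > bp and water proportion < wp."""
    return proportion(mask, BURN) > bp and proportion(mask, WATER) < wp
```

For example, a 10-pixel mask with two burned pixels and one water pixel passes with `bp=0.1, wp=0.5` but is rejected with `bp=0.3`.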
[x] Get sampler rejector for MODIS only and randomGeoSampler
[x] Get sampler rejector for MODIS only and batchRandomGeoSampler
[x] Figure out how this will work with intersection datasets
[x] Write some sampling tests
To be compatible with PyTorch, this must be implemented as a new sampler or an edited DataLoader class. Currently, I think a new sampler within TorchGeo is the way to go.
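A rough sketch of what such a sampler could look like is below. It is written in plain Python so it runs on its own; a real version would presumably subclass `torch.utils.data.Sampler` (or a TorchGeo sampler) and yield bounding boxes. The class name `RejectionSampler` and the `propose` and `load_mask` callables are hypothetical stand-ins for "draw a random query" and "read the burn mask for that query", and the label values are assumed.

```python
import random
from typing import Callable, Iterator, Optional, Sequence


class RejectionSampler:
    """Yield proposed queries whose mask passes the burn/water criteria."""

    def __init__(
        self,
        propose: Callable[[random.Random], int],    # hypothetical: draws a candidate query
        load_mask: Callable[[int], Sequence[int]],  # hypothetical: flattened mask for a query
        length: int,
        no_burn_prop: float,
        bp: float,
        wp: float,
        burn_value: int = 1,   # assumed label encoding
        water_value: int = 2,  # assumed label encoding
        max_tries: int = 10_000,
        seed: Optional[int] = None,
    ) -> None:
        self.propose = propose
        self.load_mask = load_mask
        self.length = length
        self.no_burn_prop = no_burn_prop
        self.bp = bp
        self.wp = wp
        self.burn_value = burn_value
        self.water_value = water_value
        self.max_tries = max_tries
        self.rng = random.Random(seed)

    @staticmethod
    def _fraction(mask: Sequence[int], value: int) -> float:
        return sum(1 for v in mask if v == value) / len(mask) if len(mask) else 0.0

    def __iter__(self) -> Iterator[int]:
        # Quotas: how many no-burn and how many burn samples are still needed.
        need_no_burn = round(self.length * self.no_burn_prop)
        need_burn = self.length - need_no_burn
        tries = 0
        # Stop early if max_tries is exhausted; fewer than `length` items are
        # yielded in that case rather than looping forever.
        while (need_no_burn > 0 or need_burn > 0) and tries < self.max_tries:
            tries += 1
            query = self.propose(self.rng)
            mask = self.load_mask(query)
            burn = self._fraction(mask, self.burn_value)
            water = self._fraction(mask, self.water_value)
            if burn == 0.0:
                if need_no_burn > 0:
                    need_no_burn -= 1
                    yield query
            elif burn > self.bp and water < self.wp and need_burn > 0:
                need_burn -= 1
                yield query
            # Anything else is rejected and a new candidate is drawn.

    def __len__(self) -> int:
        # Target number of samples; the iterator may yield fewer (see above).
        return self.length
```

One design point to note: because no-burn candidates are far more common, this sketch tends to fill the no-burn quota first, so the output would still need shuffling (or per-batch quotas) before training.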
As explained in this paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9480226 (page 364), we will make a sampler rejector that ensures batches of samples satisfy specific criteria. To start with, these will be the 'no_burn_prop', 'bp' and 'wp' criteria listed above.