Rob174 / detection_nappe_hydrocarbures_IMT_cefrem


Balance the dataset #26

Closed Rob174 closed 3 years ago

Rob174 commented 3 years ago

We want to feed the model roughly as many patches containing a seep or a spill as patches containing nothing.

Problem: to provide data to the model, we iterate over a predefined set of images.

One easy but impractical solution: add a file that stores the classes of each patch. Problem: it fixes the patches.

Rob174 commented 3 years ago

We can observe that a majority of patches contain only the other class.

We could create an object that keeps track of how many patches of each class have been seen, rejects (using the parameter of this name) patches of class other when there are too many of them, and adds them back once a sufficient number of seeps or spills has been seen.
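A minimal sketch of this idea, not the code from the repository: the generator name `balanced_patches`, the `margin` parameter, and the `(patch, classes)` pair format are all assumptions made for illustration. Patches that contain only the other class are held back while they outnumber seep/spill patches, and released again as soon as enough seep or spill patches have gone through.

```python
from collections import deque


def balanced_patches(patches, margin=1):
    """Yield (patch, classes) pairs with 'other'-only patches rate-limited.

    Hypothetical helper: `patches` is any iterable of (patch, classes)
    pairs, where `classes` is the set of class names found on the patch.
    A patch with no class at all is treated like an 'other'-only patch.
    """
    n_other = 0    # 'other'-only patches already given to the model
    n_target = 0   # patches with at least one seep or spill given to the model
    held_back = deque()  # rejected 'other'-only patches waiting to be re-added

    for patch, classes in patches:
        if set(classes) <= {"other"}:
            # Nothing interesting on the patch: queue it instead of yielding it.
            held_back.append((patch, classes))
        else:
            n_target += 1
            yield patch, classes

        # Re-add held-back patches while 'other' does not exceed the budget.
        while held_back and n_other < n_target + margin:
            n_other += 1
            yield held_back.popleft()


if __name__ == "__main__":
    demo = [
        ("a", {"other"}),
        ("b", {"other"}),
        ("c", {"other"}),
        ("d", {"seep"}),
        ("e", {"spill", "other"}),
        ("f", {"other"}),
    ]
    print([name for name, _ in balanced_patches(demo)])
```

In this sketch, patches still held back when the input is exhausted are simply dropped, which keeps the two counts within `margin` of each other as long as the input contains more other-only patches than seep/spill patches. It only needs the classes of the current patch, so no file listing the classes of every patch is required.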

Rob174 commented 3 years ago

Working at f4290b7