Because the classes has-edge and no-edge are highly imbalanced (has-edge makes up less than one percent of the data), the training data has to be resampled.
First I tried a self-implemented padding (with the keras-model) that duplicates has-edge samples until both classes have the same size. This didn't seem to improve the result much.
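The duplication approach can be sketched roughly as follows. This is only an illustration, not the code I used: the function name `oversample_minority`, the toy data, and the label convention (1 for has-edge, 0 for no-edge) are all assumptions.

```python
import numpy as np

def oversample_minority(features, labels, seed=0):
    """Duplicate minority-class rows until both classes have the same size."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(labels == 1)  # has-edge (assumed label 1)
    neg_idx = np.flatnonzero(labels == 0)  # no-edge (assumed label 0)
    # Order the two index arrays so the smaller class comes first.
    minority, majority = sorted((pos_idx, neg_idx), key=len)
    # Draw extra minority indices with replacement to close the gap.
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    keep = np.concatenate([majority, minority, extra])
    rng.shuffle(keep)  # avoid long runs of a single class
    return features[keep], labels[keep]

# Toy data (hypothetical): 1000 samples, 1% of them has-edge.
labels = np.zeros(1000, dtype=np.int64)
labels[:10] = 1
features = np.arange(1000, dtype=np.float32).reshape(-1, 1)

balanced_x, balanced_y = oversample_minority(features, labels)
```

Note that the padded dataset contains no new information: every added row is an exact copy of an existing has-edge sample.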
Now (with the pytorch-model) I'm using the WeightedRandomSampler from torch.utils.data. Instead of just adding data points, it draws data points at random from the original dataset according to per-sample weights. The weights I provide ensure that both classes have the same expected size in the sampled dataset.
This did seem to improve the result quite a bit, but I made no direct comparison.
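A minimal sketch of how such weights can be set up, assuming inverse class frequency as the weighting scheme (my actual weights may differ). The toy tensors, the batch size, and the label convention (1 for has-edge, 0 for no-edge) are made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Toy data (hypothetical): 10,000 samples, 1% of them has-edge.
labels = torch.zeros(10_000, dtype=torch.long)
labels[:100] = 1
features = torch.randn(10_000, 4)
dataset = TensorDataset(features, labels)

# Weight each sample by the inverse frequency of its class, so that each
# class carries the same total weight.
class_counts = torch.bincount(labels, minlength=2)
sample_weights = (1.0 / class_counts.double())[labels]

# Draw as many samples per epoch as the dataset has, with replacement.
sampler = WeightedRandomSampler(
    sample_weights, num_samples=len(dataset), replacement=True
)
loader = DataLoader(dataset, batch_size=256, sampler=sampler)

# Collect the labels of one epoch to check the class balance.
drawn = torch.cat([y for _, y in loader])
has_edge_fraction = drawn.float().mean().item()
```

With these weights roughly half of the drawn samples should be has-edge, although the exact fraction varies from epoch to epoch. The sampler replaces shuffling, so `shuffle=True` must not be passed to the `DataLoader` at the same time.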