dataset length question

vicoslab / mixed-segdec-net-comind2021

Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"


dataset length question #15

Open nixczhou opened 3 years ago

nixczhou commented 3 years ago

I wonder why, in input_ksdd2.py, input_ksdd.py, input_dagm.py and the other dataset files, the length of the dataset is 2*len(pos_samples) for training. Why not len(pos_samples) + len(neg_samples)?

self.len = 2 * len(pos_samples) if self.kind in ['TRAIN'] else len(pos_samples) + len(neg_samples)

Thank you :)

JakobBozic commented 3 years ago

Hi, this is because we use 1:1 undersampling of negative (defect-free) samples to avoid issues with an unbalanced training set. Every epoch consists of training on all positive samples and the same number of randomly selected negative samples.
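
To make the length rule concrete, here is a minimal sketch of 1:1 undersampling. It is not the repository's code: the class name `BalancedDefectDataset`, the `resample_negatives` method, and the `seed` parameter are hypothetical and only illustrate why the training length comes out as 2 * len(pos_samples).

```python
import random


class BalancedDefectDataset:
    """Hypothetical illustration of 1:1 undersampling (not the repo's class)."""

    def __init__(self, pos_samples, neg_samples, kind, seed=0):
        self.pos_samples = list(pos_samples)
        self.neg_samples = list(neg_samples)
        self.kind = kind
        self.rng = random.Random(seed)  # assumed: a seeded RNG for repeatability
        if self.kind in ['TRAIN']:
            # One epoch = all positives + an equal number of negatives.
            self.len = 2 * len(self.pos_samples)
            self.resample_negatives()
        else:
            # Evaluation uses every sample exactly once.
            self.len = len(self.pos_samples) + len(self.neg_samples)
            self.neg_subset = self.neg_samples

    def resample_negatives(self):
        # Draw len(pos_samples) negatives without replacement.
        # random.sample raises ValueError if there are fewer negatives
        # than positives.
        self.neg_subset = self.rng.sample(self.neg_samples, len(self.pos_samples))

    def __len__(self):
        return self.len

    def __getitem__(self, index):
        if index < 0 or index >= self.len:
            raise IndexError(index)
        n_pos = len(self.pos_samples)
        if index < n_pos:
            return self.pos_samples[index], 1  # label 1 = defect
        return self.neg_subset[index - n_pos], 0  # label 0 = defect-free


pos = ['p0', 'p1', 'p2']
neg = ['n%d' % i for i in range(10)]
train = BalancedDefectDataset(pos, neg, 'TRAIN')
test = BalancedDefectDataset(pos, neg, 'TEST')
train_labels = [train[i][1] for i in range(len(train))]
```

In this sketch, calling `resample_negatives()` at the start of each epoch would pick a fresh negative subset. Note that the sketch assumes there are at least as many negative samples as positive ones; with fewer negatives, the sampling step fails.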

nixczhou commented 3 years ago

Oh, I see. In my case I happen to have more positive samples than negative ones, so I get an error when training with it. Do you think I need to change the length?