Closed Bonsen closed 5 years ago
How did you solve this?
@DecentMakeover @Bonsen @vandit15 If am not wrong, it is the samples_per_cls for the whole dataset. This is so that we can use the n_y as a power for beta to calculate the effective number.
@rahulvigneswaran So do you mean that if your dataset looked like this: Class 0: 100 images Class 1: 200 images Class 2: 60 images
You would always set samples_per_cls to [100,200,60]
?
I should compute samples_per_cls of whole dataset or each batch? If there is 0 in samples_per_cls of each batch, the loss will be nan.