Open MaratKhabibullin opened 3 years ago
Could you please share the implementation of this feature, or at least a paper? Google shows that this feature is implemented differently elsewhere. For example, I've found an implementation (also called OHEM) that backpropagates only the top-K losses.
P.S. Please fix your formula. I don't understand what `SumOfPositiveClassLosses / + alpha` means.
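For comparison, here is a minimal sketch of the top-K variant mentioned above. It is only an illustration, not the code of any particular library: the helper names `top_k_loss` and `top_k_grad_mask` are made up, and averaging (rather than summing) the selected losses is an assumption.

```python
def top_k_loss(losses, k):
    """Mean of the k largest per-example losses, plus the selected indices.

    Hypothetical helper: only the selected examples contribute to the loss,
    so only they would receive gradient during backpropagation.
    """
    # Indices sorted by loss, largest first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    selected = order[:k]
    if not selected:
        return 0.0, []
    return sum(losses[i] for i in selected) / len(selected), selected


def top_k_grad_mask(losses, k):
    """Derivative of top_k_loss with respect to each per-example loss.

    Selected examples get weight 1/len(selected); all others get 0.
    """
    _, selected = top_k_loss(losses, k)
    mask = [0.0] * len(losses)
    for i in selected:
        mask[i] = 1.0 / len(selected)
    return mask
```

In an autograd framework the same effect usually comes from taking the top-K values of the per-example loss tensor and reducing only those, so the remaining examples never enter the graph.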
Please add online hard example mining (OHEM) based on BCE, L1, L2 and other losses for binary classification.
`Loss = SumOfPositiveClassLosses / <number of positive class examples> + alpha * SumOfNegativeClassLosses / <number of negative class examples> + beta * SumOfHARDNegativeClassLosses / <number of hard negative class examples>`
`hard negatives` are the top `K` negative-class elements with the highest loss values. `alpha`, `beta`, and `K` are external parameters.
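To make the request concrete, here is a rough sketch of how such a loss could be computed. It assumes each sum is divided by the number of examples in its own group (positives, negatives, hard negatives) and that an empty group contributes zero. The names `ohem_loss`, `bce`, and `per_example_loss` are hypothetical and only for illustration.

```python
import math


def bce(p, y, eps=1e-12):
    """Per-example binary cross-entropy for probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def ohem_loss(probs, labels, alpha=1.0, beta=1.0, k=2, per_example_loss=bce):
    """Positive mean + alpha * negative mean + beta * hard-negative mean.

    Hard negatives are the k negative-class examples with the highest loss.
    per_example_loss can be swapped for L1, L2, or any other per-example loss.
    """
    losses = [per_example_loss(p, y) for p, y in zip(probs, labels)]
    pos = [l for l, y in zip(losses, labels) if y == 1]
    neg = [l for l, y in zip(losses, labels) if y == 0]
    hard = sorted(neg, reverse=True)[:k]  # top-k negatives by loss

    def mean(xs):
        # Assumption: an empty group contributes nothing to the loss.
        return sum(xs) / len(xs) if xs else 0.0

    return mean(pos) + alpha * mean(neg) + beta * mean(hard)
```

For example, `ohem_loss(probs, labels, per_example_loss=lambda p, y: abs(p - y))` would give the L1-based version. Note that hard negatives are counted twice here (once in the negative term and once in the hard-negative term), which is how the formula reads as written.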