tyagi-iiitv / PointPillars


Why did you use tensorflow probability to determine hard_neg_mask in your focal loss? #24

Closed: tjtanaa closed this issue 3 years ago

tjtanaa commented 3 years ago

Hi. This is the snippet of your focal loss:

    import tensorflow as tf
    import tensorflow_probability as tfp
    from tensorflow.keras import backend as K

    # Method of the loss class; self.gamma and self.alpha are set elsewhere.
    def focal_loss(self, y_true: tf.Tensor, y_pred: tf.Tensor):
        # Positive-anchor mask.
        self.mask = tf.equal(y_true, 1)

        cross_entropy = K.binary_crossentropy(y_true, y_pred)

        # p_t is the predicted probability of the true class, as in the focal loss paper.
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        gamma_factor = tf.pow(1.0 - p_t, self.gamma)
        alpha_factor = y_true * self.alpha + (1.0 - y_true) * (1.0 - self.alpha)
        focal_loss = gamma_factor * alpha_factor * cross_entropy

        # Keep all positives, plus only those negatives whose loss exceeds
        # the 90th percentile of the negative losses ("hard" negatives).
        neg_mask = tf.equal(y_true, 0)
        thr = tfp.stats.percentile(tf.boolean_mask(focal_loss, neg_mask), 90.)
        hard_neg_mask = tf.greater(focal_loss, thr)
        # mask = tf.logical_or(tf.equal(y_true, 0), tf.equal(y_true, 1))
        mask = tf.logical_or(self.mask, tf.logical_and(neg_mask, hard_neg_mask))
        masked_loss = tf.boolean_mask(focal_loss, mask)

Could I know why you chose to compute the 90th percentile and derive a hard-negative mask from it, instead of using the original focal loss mask?

What performance gain did you obtain after making this change?
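
For reference, tfp.stats.percentile is presumably used because core TensorFlow has no built-in percentile op. A minimal sketch (with made-up loss values) of what the threshold selects:

    import tensorflow as tf
    import tensorflow_probability as tfp

    # Toy per-anchor negative losses, sorted here only for readability.
    per_anchor_loss = tf.constant([0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9])
    thr = tfp.stats.percentile(per_anchor_loss, 90.)  # ~0.7 with the default 'nearest' interpolation
    hard = tf.greater(per_anchor_loss, thr)           # True only for the hardest ~10% of anchors
    print(thr.numpy(), hard.numpy())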

nschein commented 3 years ago

Hi, are you asking (a) why use negative samples at all, or (b) why keep only the top 10 percent?

(a) Negative examples help to prevent box detections where we don't want them.

(b) I wanted to limit the influence of the negative examples, because I felt they were overpowering the correct examples. I have no information about the performance gain, though.
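
For what it's worth, the same "keep only the hardest negatives" effect can be achieved without tensorflow_probability via top-k selection over the negative losses, the classic hard-negative-mining recipe. A minimal sketch assuming TF 2 eager mode; the function name and keep_frac are illustrative, not from this repo:

    import tensorflow as tf

    def hard_negative_mask(focal_loss, neg_mask, keep_frac=0.1):
        """Boolean mask selecting roughly the hardest keep_frac of negative anchors."""
        neg_losses = tf.boolean_mask(focal_loss, neg_mask)
        n_keep = tf.maximum(1, tf.cast(tf.cast(tf.size(neg_losses), tf.float32) * keep_frac, tf.int32))
        kth = tf.math.top_k(neg_losses, k=n_keep).values[-1]  # loss of the k-th hardest negative
        return tf.logical_and(neg_mask, focal_loss >= kth)

Thresholding at the 90th percentile and keeping the top 10% by top-k are the same selection; the percentile form just states it directly.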

tjtanaa commented 3 years ago

Thank you very much.