nabsabraham / focal-tversky-unet

This repo contains the code for our paper "A novel focal Tversky loss function and improved Attention U-Net for lesion segmentation" accepted at IEEE ISBI 2019.

Loss calculated on wrong dimension #30

Open gergopool opened 2 years ago

gergopool commented 2 years ago

Hi,

When you wrote the loss functions, did you test them with batched data? E.g., reading your Tversky loss implementation, I think you accidentally summed everything over the whole batch instead of per data point.

    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1-y_pred_pos))
    false_pos = K.sum((1-y_true_pos)*y_pred_pos)

When you do this, you sum over every prediction in the batch, but you actually want to compute the loss for each input image separately. Let me suggest the following:

    def tversky(y_true, y_pred, alpha=0.7, smooth=1.):
        # flatten each sample separately, keeping the batch dimension
        y_true_pos = tf.reshape(y_true, (tf.shape(y_true)[0], -1))
        y_pred_pos = tf.reshape(y_pred, (tf.shape(y_pred)[0], -1))
        # reduce over axis 1 only, so we get one Tversky index per sample
        true_pos = K.sum(y_true_pos * y_pred_pos, axis=1)
        false_neg = K.sum(y_true_pos * (1 - y_pred_pos), axis=1)
        false_pos = K.sum((1 - y_true_pos) * y_pred_pos, axis=1)
        return (true_pos + smooth) / (true_pos + alpha * false_neg
                                      + (1 - alpha) * false_pos + smooth)

    def tversky_loss(y_true, y_pred):
        return K.sum(1 - tversky(y_true, y_pred))
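To make the difference concrete, here is a minimal NumPy sketch (toy data and the standard Tversky formula with alpha=0.7, smooth=1 assumed) comparing the batch-aggregated index with the per-sample one:

```python
import numpy as np

def tversky_batch(y_true, y_pred, alpha=0.7, smooth=1.0):
    # flatten everything, aggregating over the whole batch
    t, p = y_true.ravel(), y_pred.ravel()
    tp = np.sum(t * p)
    fn = np.sum(t * (1 - p))
    fp = np.sum((1 - t) * p)
    return (tp + smooth) / (tp + alpha * fn + (1 - alpha) * fp + smooth)

def tversky_per_sample(y_true, y_pred, alpha=0.7, smooth=1.0):
    # flatten each sample separately, keeping the batch dimension
    t = y_true.reshape(y_true.shape[0], -1)
    p = y_pred.reshape(y_pred.shape[0], -1)
    tp = np.sum(t * p, axis=1)
    fn = np.sum(t * (1 - p), axis=1)
    fp = np.sum((1 - t) * p, axis=1)
    return (tp + smooth) / (tp + alpha * fn + (1 - alpha) * fp + smooth)

# two toy "images": one perfect prediction, one poor one
y_true = np.array([[1., 1., 0., 0.], [1., 1., 1., 1.]])
y_pred = np.array([[1., 1., 0., 0.], [0., 0., 0., 1.]])

print(tversky_batch(y_true, y_pred))       # one score for the whole batch
print(tversky_per_sample(y_true, y_pred))  # one score per image
```

The batch version collapses an easy and a hard image into one number, which is not the same as averaging their per-sample scores.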

As a consequence, your focal Tversky loss should behave essentially the same as the plain Tversky loss, because you raise a single batch-aggregated scalar to the power of gamma. Instead, I believe you should raise the per-sample outputs to the power of gamma before summing.
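The two orderings really do give different values. A small NumPy sketch (the per-sample Tversky indices below are hypothetical toy numbers) of raising to gamma before vs. after the sum:

```python
import numpy as np

gamma = 0.75
# hypothetical per-sample Tversky indices for a batch of two images:
# one easy sample (near-perfect) and one hard sample
ti = np.array([0.95, 0.40])

# aggregate first, then raise the single scalar to gamma
loss_after_sum = np.sum(1 - ti) ** gamma

# raise each sample's (1 - TI) to gamma, then sum
loss_before_sum = np.sum((1 - ti) ** gamma)

print(loss_after_sum, loss_before_sum)
```

Only the second form lets gamma re-weight easy vs. hard samples within the batch, which is the point of the focal modification.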

Of course, if your loss is somehow evaluated per sample and Keras performs the aggregation automatically later, then everything is fine.