Digital-Dermatology / t-loss

Official code for Robust T-Loss for Medical Image Segmentation (MICCAI 2023)
https://robust-tloss.github.io/
Apache License 2.0

Negative loss values - complete training breakdown #4

Closed: ABotond closed this issue 4 months ago

ABotond commented 10 months ago

Hi!

I was trying your T-Loss function and ran into a blocking issue. I am using images of size 512x512; the input_tensor is the sigmoid-activated output of my segmentation network, and the target is a binary mask.

At the very beginning of training I get a huge loss value. However, after a few hundred iterations (~2 epochs) the loss becomes negative (in the -10000 to -100000 range). I've tried the default nu = 1.0 initialization as well as nu = 0.0, but the results are the same. During these iterations self.nu did not change.

Here is an example of the 6 terms of the total loss (in order): [-1413416.0, 0.5723649859428406, -131072.0, 150042.03125, 0.0013107199920341372, 1312091.375]

And total_losses.mean() is -82353.

Am I doing something wrong (like the activation), does this function not work with larger images, or is there a more general issue?
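
For reference, this is roughly what my training step looks like. The TLoss import and constructor below are placeholders (the actual module name and constructor arguments in this repo may differ), but the data flow is as described: sigmoid-activated network output against a binary 512x512 mask.

```python
import torch
import torch.nn as nn

# Placeholder import: the actual module name / constructor arguments in this repo may differ.
from t_loss import TLoss

model = nn.Sequential(                              # stand-in for my real segmentation network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
criterion = TLoss(nu=1.0)                           # also tried initializing nu = 0.0
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.rand(4, 3, 512, 512)                 # dummy batch of 512x512 images
masks = (torch.rand(4, 1, 512, 512) > 0.5).float()  # binary target masks

logits = model(images)
preds = torch.sigmoid(logits)                       # sigmoid-activated output (input_tensor)
loss = criterion(preds, masks)                      # this is what turns strongly negative
optimizer.zero_grad()
loss.backward()
optimizer.step()
```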

skeletonNN commented 5 months ago

Did you solve it?

ABotond commented 5 months ago

@skeletonNN Unfortunately I didn't. I remember playing around with it for quite a while (though I no longer remember exactly what I tried), and then I gave up on using it...

alvarogonjim commented 5 months ago

Hey @skeletonNN and @ABotond, I'm not sure about your specific scenarios, but in my experiments the loss values tend to be negative and large in magnitude due to the exponential operation. The crucial point is to ensure that the loss consistently decreases over time and eventually converges.

Another point: since we're minimizing the negative log-likelihood, images with clean masks tend to have significantly lower loss values than images with noisy masks (affine transformations, dilation, or erosion).
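
To put some numbers on that, here is a small, self-contained sketch using the multivariate Student-t negative log-likelihood that the T-Loss is built on (identity scale and no learnable per-pixel terms, so it is a simplified stand-in rather than the exact implementation in this repo). With D = 512x512 = 262144 pixels and nu = 1, the -lgamma((nu+D)/2) term alone is about -1.41e6, which matches the first of the six terms you posted, so per-image totals well below zero are expected even early in training.

```python
import math
import torch

def student_t_nll(pred, target, nu=1.0):
    """Simplified multivariate Student-t NLL over all pixels (identity scale,
    no learnable per-pixel terms), used here only to illustrate the magnitudes."""
    D = target[0].numel()                               # pixels per image, 512*512 = 262144
    r2 = ((pred - target) ** 2).flatten(1).sum(dim=1)   # squared residual norm per image
    nll = (
        -torch.lgamma(torch.tensor((nu + D) / 2.0))     # about -1.41e6 for nu=1, D=262144
        + torch.lgamma(torch.tensor(nu / 2.0))
        + (D / 2.0) * math.log(nu * math.pi)
        + ((nu + D) / 2.0) * torch.log1p(r2 / nu)
    )
    return nll.mean()

target = (torch.rand(2, 1, 512, 512) > 0.5).float()
pred = target.clamp(0.05, 0.95)                         # a near-perfect sigmoid-style prediction
print(student_t_nll(pred, target).item())               # roughly -4.1e5: strongly negative, as expected
```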

CosmoWood commented 5 months ago

So negative values are normal, right? You mentioned that we should 'ensure that the loss consistently decreases over time and eventually converges.' When the loss is negative, should it be getting closer to zero?

alvarogonjim commented 4 months ago

@CosmoWood Indeed, it's normal for the values to be negative when minimizing this loss function. When the loss reaches zero it can keep decreasing and become negative, which can indicate that the model is still improving. Once the loss is negative it will not head back toward zero; it will simply keep decreasing.
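
A quick check with the same simplified stand-in from my earlier comment shows this behaviour: as the per-pixel prediction error shrinks, the loss goes from positive, through zero, to increasingly negative values instead of settling at zero.

```python
import math
import torch

# Same simplified Student-t NLL as in my earlier comment (a stand-in, not the exact T-Loss).
def student_t_nll(pred, target, nu=1.0):
    D = target[0].numel()
    r2 = ((pred - target) ** 2).flatten(1).sum(dim=1)
    return (-torch.lgamma(torch.tensor((nu + D) / 2.0))
            + torch.lgamma(torch.tensor(nu / 2.0))
            + (D / 2.0) * math.log(nu * math.pi)
            + ((nu + D) / 2.0) * torch.log1p(r2 / nu)).mean()

target = (torch.rand(1, 1, 512, 512) > 0.5).float()
for err in (0.4, 0.2, 0.1, 0.05, 0.01):             # per-pixel error shrinks as the model improves
    pred = (target - err).abs()                     # prediction that is off by `err` at every pixel
    print(f"err={err}: loss={student_t_nll(pred, target).item():.0f}")
# Output decreases monotonically: positive at err=0.4, then negative and still decreasing.
```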