DAMO-DI-ML / KDD2023-DCdetector


train loss is always 0, is this a bug? #5

Open zhangbububu opened 1 year ago

zhangbububu commented 1 year ago

line 168 in solver.py:

loss = prior_loss - series_loss

train loss is always 0, is this a bug?

maheshyadav007 commented 1 year ago

line 168 in solver.py:

loss = prior_loss - series_loss

train loss is always 0, is this a bug?

This is happening for me too. I think line 168 should be loss = prior_loss + series_loss.

yyysjz1997 commented 1 year ago

line 168 in solver.py:

loss = prior_loss - series_loss

train loss is always 0, is this a bug?

This is happening for me too. I think line 168 should be loss = prior_loss + series_loss.

Thanks for pointing that out; it's not a bug. You can replace the "-" with "+" to see what effect it has on the results, especially for datasets with low anomaly ratios.

tianzhou2011 commented 1 year ago

Thank you all for bringing this to our attention. We must admit that we overlooked this aspect initially. Upon further investigation, the loss does indeed seem to evaluate to zero. However, when we ran the MSL experiment with the loss changed to 0 * (prior_loss - series_loss), we observed a significant decrease of over 10% in F1 performance. Hence, either the actual training loss is not exactly zero but a very small number, or our understanding of Torch is not as comprehensive as it should be. As an example, if you keep the loss as prior_loss - series_loss and print the values of prior_loss and series_loss during training, you will observe that they keep updating. However, if you set the loss to 0 * (prior_loss - series_loss), both losses remain unchanged.
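The effect described here can be reproduced outside the model. Below is a minimal sketch, using toy scalars standing in for the two branches rather than the repository's tensors, comparing the shipped loss with the zeroed-out variant: the first yields nonzero gradients, the second yields none, which is why the printed losses stop moving in that run.

import torch

# Toy scalar stand-ins for the two branches, not the real model tensors.
a = torch.tensor(2.0, requires_grad=True)  # plays the "series" branch
b = torch.tensor(5.0, requires_grad=True)  # plays the "prior" branch

# Mimic the stop-gradient pattern around line 168 of solver.py: each
# term detaches the opposite branch, so the two values coincide.
series_loss = (a - b.detach()).pow(2)
prior_loss = (a.detach() - b).pow(2)

# Variant 1: the shipped loss. Its value is 0, but its gradients are not,
# so both parameters keep updating -- and the printed losses keep moving.
(prior_loss - series_loss).backward(retain_graph=True)
print(a.grad, b.grad)  # tensor(6.) tensor(6.)

a.grad, b.grad = None, None

# Variant 2: scaling by zero also scales every gradient to zero, so
# nothing updates and the printed losses "remain still", as observed.
(0 * (prior_loss - series_loss)).backward()
print(a.grad, b.grad)  # tensor(0.) tensor(0.)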

tianzhou2011 commented 1 year ago

An update: the value is in fact precisely zero. However, training is still effective because of the distinct stop gradients we assign to the two branches, so while the value of prior_loss - series_loss is zero, its gradients are not. It must be acknowledged that this aspect went unnoticed during our research, as we focused solely on monitoring the test F1 metrics. We sincerely thank all of you, and the other researchers who noticed this, for bringing it to our attention; it is indeed an intriguing phenomenon.
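For concreteness, here is a minimal sketch of that mechanism, assuming a toy symmetric-KL helper and random inputs rather than the repository's attention maps: the detached copies leave the loss value at exactly zero, while each term still sends gradients into its own branch.

import torch

def sym_kl(p, q):
    # Symmetric KL divergence between two distributions; a toy stand-in
    # for the paired KL terms computed in solver.py.
    return (p * (p / q).log()).sum() + (q * (q / p).log()).sum()

torch.manual_seed(0)
series = torch.rand(8).softmax(-1).requires_grad_()  # toy "patch-wise" branch
prior = torch.rand(8).softmax(-1).requires_grad_()   # toy "in-patch" branch

# The stop-gradient pattern: each loss detaches the opposite branch.
series_loss = sym_kl(series, prior.detach())  # gradients reach only `series`
prior_loss = sym_kl(series.detach(), prior)   # gradients reach only `prior`

loss = prior_loss - series_loss
print(loss.item())  # 0.0 exactly -- detach() changes gradients, not values

loss.backward()
print(series.grad.abs().sum().item() > 0)  # True: this branch still trains
print(prior.grad.abs().sum().item() > 0)   # True: so does this one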

herozen97 commented 3 months ago

Why is the loss not "prior_loss + series_loss", as given in equation (9) of the paper?

herozen97 commented 3 months ago

Another question: my "prior_loss" is always exactly equal to "series_loss". Why is that?
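That equality is expected given the stop-gradient explanation above: .detach() blocks gradients but never changes values, so both losses evaluate the same symmetric expression on the same numbers, and their difference is exactly the zero reported in this issue. A minimal check with toy tensors (hypothetical names, not the repository's code):

import torch

# .detach() never changes a value, so the two losses are computed from
# identical numbers and always match.
x = torch.randn(4, requires_grad=True)
y = torch.randn(4, requires_grad=True)

series_loss = (x - y.detach()).pow(2).mean()
prior_loss = (x.detach() - y).pow(2).mean()

print(torch.equal(series_loss, prior_loss))  # True on every step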