Raschka-research-group / coral-cnn

Rank Consistent Ordinal Regression for Neural Networks with Application to Age Estimation
https://www.sciencedirect.com/science/article/pii/S016786552030413X
MIT License
341 stars 62 forks

Loss function is different from the article #9

Closed TZZZZ closed 5 years ago

TZZZZ commented 5 years ago

Am I right that the loss function in the code is not the same as the one described in the article (page 3, Eq. (4))? Why?

In file ./model-code/resnet34/cacd-coral.py:

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.log_softmax(logits, dim=2)[:, :, 1] * levels
                      + F.log_softmax(logits, dim=2)[:, :, 0]*(1-levels)) * imp, dim=1))
    return torch.mean(val)
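For context, this first version is a two-class softmax cross-entropy per binary task: for per-task logits (a, b), the indices [..., 0] and [..., 1] select the two log-probabilities. A minimal pure-Python sketch for a single task (the numbers are made up for illustration):

```python
import math

def log_softmax2(a, b):
    # numerically stable 2-class log-softmax: subtract the max before exponentiating
    m = max(a, b)
    z = math.log(math.exp(a - m) + math.exp(b - m)) + m
    return a - z, b - z

a, b = -0.3, 1.2            # hypothetical per-task logits
lp0, lp1 = log_softmax2(a, b)
y = 1.0                     # binary label for this task ("age > r_k")
nll = -(y * lp1 + (1 - y) * lp0)
print(round(nll, 4))        # -> 0.2014
```

Note that lp1 here equals log(sigmoid(b - a)), which is why a two-logit softmax per task and a single-logit sigmoid per task can be made to agree.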

In file ./model-code/resnet34/afad-coral.py:

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.logsigmoid(logits) * levels
                      + (F.logsigmoid(logits) - logits)*(1-levels)) * imp,
           dim=1))
    return torch.mean(val)

Why not torch.log(1 - torch.sigmoid(logits)) instead of (F.logsigmoid(logits) - logits)?
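For reference, this second version is the importance-weighted sum of K - 1 binary cross-entropy terms, as in Eq. (4) of the paper. A minimal pure-Python sketch for a single example (hypothetical logits, labels, and weights):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One training example with K - 1 = 3 binary tasks ("is age > r_k?").
logits = [2.0, 0.5, -1.5]   # hypothetical network outputs g_k
levels = [1.0, 1.0, 0.0]    # extended binary labels y_k
imp    = [1.0, 1.0, 1.0]    # task-importance weights (uniform here)

# Per-task binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)], p = sigmoid(g)
loss = -sum(w * (y * math.log(sigmoid(g)) + (1 - y) * math.log(1 - sigmoid(g)))
            for g, y, w in zip(logits, levels, imp))
print(round(loss, 4))       # -> 0.8024
```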

rasbt commented 5 years ago

You are right, good catch. It looks like something went wrong when I renamed the files while preparing this repo from my original code files. I.e., the file

 ./model-code/resnet34/cacd-coral.py:

should be

 ./model-code/resnet34/cacd-ordinal.py:

Will fix that.

rasbt commented 5 years ago

Should be fixed now. Thanks!

TZZZZ commented 5 years ago

Thanks, but I still have a question.

Now code in cacd-coral.py is

def cost_fn(logits, levels, imp):
    val = (-torch.sum((F.logsigmoid(logits)*levels
                      + (F.logsigmoid(logits) - logits)*(1-levels))*imp,
           dim=1))
    return torch.mean(val)

Why does the second summand use (F.logsigmoid(logits) - logits) and not torch.log(1 - torch.sigmoid(logits))?

rasbt commented 5 years ago

They should be equivalent: since 1 - sigmoid(z) = sigmoid(-z), we have log(1 - sigmoid(z)) = log(sigmoid(-z)) = log(sigmoid(z)) - z. I remember we were concerned about numerical stability (1 - sigmoid(z) underflows to zero for large z, so its log is undefined), which is why we wrote it this way.

[Screenshot 2019-10-10 11 21 06]
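The equivalence log(1 - sigmoid(x)) = log(sigmoid(x)) - x can be checked numerically in plain Python. A minimal sketch (the log_sigmoid helper below mimics what F.logsigmoid computes; it is illustrative, not from the repo):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_sigmoid(x):
    # stable log(sigmoid(x)): -log1p(exp(-x)) for x >= 0, x - log1p(exp(x)) otherwise
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

# Identity: log(1 - sigmoid(x)) = log(sigmoid(-x)) = log(sigmoid(x)) - x
x = 3.0
naive  = math.log(1.0 - sigmoid(x))   # direct form, as the question suggests
stable = log_sigmoid(x) - x           # form used in cost_fn
assert abs(naive - stable) < 1e-9

# For large x the direct form breaks: 1 - sigmoid(40) underflows to exactly 0.0,
# so math.log would raise, while the rewritten form stays finite (about -40).
assert 1.0 - sigmoid(40.0) == 0.0
stable_big = log_sigmoid(40.0) - 40.0
assert math.isfinite(stable_big)
```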
TZZZZ commented 5 years ago

Thank you, it is clear now.