In your paper you define a loss function in equation (4). Where is this loss function derived from? Is there a source for it, or is it your own contribution?
Also, in your implementation of it here: https://github.com/Microsoft/FERPlus/blob/master/src/train.py#L44
you take the logarithm after finding the maximum, rather than applying it to the predictions as the log(q) in equation (4) suggests. Is there a particular reason for this? I would appreciate more elaboration.
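
To make the question concrete, here is a minimal sketch of the two orderings as I understand them. It is not taken from the paper or the repository: the function names are hypothetical, the values of `q` and `mask` are made up, and it assumes the target acts as a 0/1 mask over the labels.

```python
import math


def log_after_max(q, mask):
    """Mask the probabilities, take the maximum, then take the log."""
    return math.log(max(m * p for m, p in zip(mask, q)))


def max_of_log(q, mask):
    """Take the log of each allowed probability, then take the maximum."""
    return max(math.log(p) for m, p in zip(mask, q) if m > 0)


# Hypothetical values, for illustration only.
q = [0.1, 0.6, 0.3]   # assumed softmax output over three labels
mask = [0, 1, 1]      # assumed 0/1 mask marking the acceptable labels

a = log_after_max(q, mask)
b = max_of_log(q, mask)

# Both orderings pick the same entry because log is monotonically increasing.
print(a, b, math.isclose(a, b))
```

In this toy case the two orderings give the same value, because log is monotonically increasing. The first ordering also never evaluates the log of a masked-out (zero) entry.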