glhfgg1024 closed this issue 7 years ago
@glhfgg1024 $U_i(l)$ in the paper does denote negative unary energy. Note that "energy" in the paper means negative log probability (low energy => high probability). In the code, "inputs" in the line you linked is the output of the FCN network. Therefore, it's already a probability (negative energy). That's why we don't need to negate it again.
EDIT: In the last line I should have mentioned 'log-probability' instead of 'probability'
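To make the energy convention concrete, here is a minimal sketch in plain Python. It is not taken from the repository; the energy values and variable names are made up for illustration. It only shows that negating an energy gives an unnormalised log-probability, so the label with the lowest energy ends up with the highest probability.

```python
import math

# Hypothetical unary energies for one pixel over three labels (illustrative values).
energy = [2.0, 0.5, 1.0]

# Negative energy acts as an unnormalised log-probability.
neg_energy = [-e for e in energy]

# Normalise with a numerically stable softmax to obtain label probabilities.
m = max(neg_energy)
exps = [math.exp(v - m) for v in neg_energy]
total = sum(exps)
prob = [e / total for e in exps]

# The label with the lowest energy is also the most probable label.
lowest_energy_label = energy.index(min(energy))
most_probable_label = prob.index(max(prob))
```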
Hi @sadeepj, thanks a lot for your kind reply and explanations!
But maybe I misunderstood something. For example, in your code, https://github.com/sadeepj/crfasrnn_keras/blob/master/crfrnn_model.py#L103, the unary given as input to the `CrfRnnLayer` is still the logits, right? The `upscore` has not been converted to probabilities before being fed into the `CrfRnnLayer`. As I understood it (maybe wrongly), `upscore` should first be converted to probabilities with the `tf.nn.softmax` function, then the `log` operation applied to get the unary cost $\psi_u(x_i)$, and then negated to get $U_i(l) = -\psi_u(x_i = l)$. If it is convenient for you, could you please help clarify what I misunderstood?
@glhfgg1024
Sorry for taking so long to answer! I've been a bit busy.
As mentioned in the last line, column 1, page 4 of https://arxiv.org/pdf/1502.03240.pdf, $U_i(l) = -\psi_u(x_i = l)$ (note the negative sign). Basically, the $U_i(l)$ values are in the 'log-probability' or 'negative energy' domain (higher value => higher probability). Now, as you mentioned, ideally we should have applied `softmax()` to `upscore` and then taken the `log()` of that. But since `softmax()` does an `exp()` operation, `softmax()` followed by `log()` is largely redundant (it just adds computational cost). In other words, `upscore` is already in the 'log-probability' domain. I understand that `softmax()` followed by `log()` is not the same as doing nothing, but it doesn't really matter, as we learn the optimal CRF parameters anyway (these parameters decide how to weigh the different inputs during inference).
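The gap between raw logits and `log(softmax(logits))` can be checked numerically. The sketch below is plain Python rather than the repository's TensorFlow code, and `logits` is a hypothetical stand-in for the `upscore` values at a single pixel. Under those assumptions it shows that the two differ by one constant shared across all labels (the log-sum-exp of the logits), and that a softmax applied to either gives the same distribution.

```python
import math

def softmax(x):
    # Numerically stable softmax over a list of scores.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def log_softmax(x):
    # log(softmax(x)) computed as x - logsumexp(x).
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]

# Hypothetical per-label scores at one pixel (illustrative values).
logits = [1.2, -0.3, 2.5]

log_probs = log_softmax(logits)

# Every label is shifted by the same amount: logsumexp(logits).
offsets = [a - b for a, b in zip(logits, log_probs)]

# Softmax ignores a shift shared by all labels, so both inputs agree.
p_from_logits = softmax(logits)
p_from_log_probs = softmax(log_probs)
```

This only covers the normalisation step for one pixel; it says nothing about how the learned CRF parameters interact with the unaries during inference.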
Hi @sadeepj, thanks very much for your kind answers!
Hi, Mr. Jayasumana, thanks a lot for sharing your valuable code!
I have a question: in https://github.com/sadeepj/crfasrnn_keras/blob/master/crfrnn_layer.py#L76, do we need to negate the probabilities first? In the paper https://arxiv.org/pdf/1502.03240.pdf, page 4, bottom of the left column, it says "we use $U_i(l)$ to denote the negative of the unary energy".