3rd3 closed this 7 years ago
Also, you may need to clip the values passed to the log, as is done here. The base of the log does not matter, I think, since changing it only rescales the loss by a constant factor.
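For illustration, here is a minimal sketch of that kind of clipping, assuming NumPy. The names `bce_loss` and `eps` and the value `1e-7` are my own placeholders, not taken from the repository or the linked code.

```python
import numpy as np

def bce_loss(preds, targets, eps=1e-7):
    """Binary cross-entropy with predictions clipped away from 0 and 1.

    `eps` is an assumed clipping margin, chosen only for this sketch.
    """
    preds = np.clip(np.asarray(preds, dtype=np.float64), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=np.float64)
    return -np.sum(targets * np.log(preds) + (1.0 - targets) * np.log(1.0 - preds))

# Without clipping, a prediction of exactly 0 or 1 would hit log(0) = -inf.
loss = bce_loss([0.0, 1.0, 0.5], [0.0, 1.0, 1.0])

# Changing the base only rescales the loss: log2(x) = ln(x) / ln(2).
loss_base2 = loss / np.log(2.0)
```

The clipped loss stays finite even when a prediction saturates, and the base-2 value differs from the natural-log value only by the constant factor `1 / ln(2)`, so the minimizer is the same.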
Thanks for the note. This confused me at first as well, since Ŷ is usually the prediction. However, you can see in the adversarial loss equations for both D and G that the prediction is passed as the first parameter to L_bce (Y) and the 1/0 label as the second (Ŷ). This lines up with the fact that, in cross-entropy, the prediction is usually the term inside the log. I believe they mixed up the domains. (There are several other similarly confusing typos in the paper that the authors clarified for me when I reached out to them.)
That also corresponds to the definition on Wikipedia. I should have looked there first. Thanks!
I think I've found another bug: in the BCE loss you calculate
-sum(targets · log(preds) + (1 - targets) · log(1 - preds))
, whereas in the paper it is defined like this:
-sum(Ŷ · log(Y) + (1 - Ŷ) · log(1 - Y))
Following the other notation in the paper, Ŷ appears to be the prediction and Y the target, which also matches the real interval being the domain of Ŷ. So the roles are the other way around in the paper. Is the BCE symmetric?
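On the symmetry question, a quick numerical check suggests the answer is no: swapping the two arguments changes the value. This is a minimal sketch assuming NumPy. The names `bce` and `eps` and the sample values are illustrative, not taken from the repository or the paper.

```python
import numpy as np

def bce(preds, targets, eps=1e-7):
    # The first argument goes inside the log, so it is clipped into (0, 1).
    # `eps` is an assumed clipping margin for this sketch only.
    preds = np.clip(np.asarray(preds, dtype=np.float64), eps, 1.0 - eps)
    targets = np.asarray(targets, dtype=np.float64)
    return -np.sum(targets * np.log(preds) + (1.0 - targets) * np.log(1.0 - preds))

preds = np.array([0.9, 0.2])    # values in [0, 1]
targets = np.array([1.0, 0.0])  # hard 1/0 labels

forward = bce(preds, targets)   # prediction inside the log
swapped = bce(targets, preds)   # label inside the log (arguments exchanged)
```

Here `forward` is a small loss for good predictions, while `swapped` puts the hard 0/1 labels inside the log and is finite only because of the clipping. So which argument is the prediction does matter.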