Binary Cross-Entropy loss

dyelax / Adversarial_Video_Generation

A TensorFlow Implementation of "Deep Multi-Scale Video Prediction Beyond Mean Square Error" by Mathieu, Couprie & LeCun.

MIT License

734 stars, 184 forks

Binary Cross-Entropy loss #3

Closed: 3rd3 closed this issue 7 years ago

3rd3 commented 7 years ago

I think I've found another bug. In the BCE loss you calculate -sum(targets · log(preds) + (1 - targets) · log(1 - preds)), whereas in the paper it is defined like this:

L_bce(Y, Ŷ) = -Σ_i [Ŷ_i · log(Y_i) + (1 - Ŷ_i) · log(1 - Y_i)], with Y_i ∈ {0, 1} and Ŷ_i ∈ [0, 1]

Following the notation used elsewhere in the paper, Ŷ appears to be the prediction and Y the target, which also matches the paper giving a real interval as the domain of Ŷ. So the roles of the two arguments in the paper are swapped relative to your code. Is the BCE symmetric?
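On the symmetry question, a quick numeric check suggests the BCE is not symmetric in its two arguments, so the order does matter. This is only a minimal sketch in plain Python; the helper `bce(preds, targets)` and the sample values are hypothetical and are not taken from the repository's TensorFlow code.

```python
import math

def bce(preds, targets):
    """Binary cross-entropy: -sum(t * log(p) + (1 - t) * log(1 - p))."""
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(preds, targets))

# Hypothetical values, all strictly inside (0, 1) so every log is defined.
a = [0.9, 0.2]
b = [0.6, 0.4]

loss_ab = bce(a, b)  # a inside the log, b as the weights
loss_ba = bce(b, a)  # roles swapped

print(loss_ab, loss_ba)  # the two values differ
```

With hard 0/1 labels the swap is worse than a different value: the labels would end up inside the log, and log(0) is undefined.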

3rd3 commented 7 years ago

Also you may need to clip the values for the log as it's done here. The base of the log does not matter, I think.
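As a rough illustration of the clipping idea, the sketch below clamps each prediction into [eps, 1 - eps] before taking the log, so predictions of exactly 0 or 1 no longer produce log(0). The function name `bce_clipped` and the constant `EPS = 1e-7` are assumptions chosen for this example, not values from the repository or the linked code.

```python
import math

EPS = 1e-7  # hypothetical clipping constant, chosen for illustration only

def bce_clipped(preds, targets, eps=EPS):
    """BCE with predictions clamped into [eps, 1 - eps] before the log."""
    total = 0.0
    for p, t in zip(preds, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep both log arguments positive
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total

# Predictions that are exactly wrong would hit log(0) without clipping.
worst_case = bce_clipped([0.0, 1.0], [1.0, 0.0])
# Predictions that are exactly right give a loss very close to zero.
best_case = bce_clipped([1.0, 0.0], [1.0, 0.0])

print(worst_case, best_case)
```

Changing the base of the log only rescales the loss by a constant factor, so it does not move the minimum.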

dyelax commented 7 years ago

Thanks for the note. This confused me at first as well, since Ŷ is usually the prediction. However, you can see in the adversarial loss equations for both D and G that the prediction is passed as the first parameter to L_bce (Y) and the 1/0 label is passed as the second (Ŷ). This lines up with the fact that, for cross-entropy, the prediction is usually the term inside the log. I believe they mixed up the domains. (There are several other similarly confusing typos in the paper that the authors clarified for me when I reached out to them.)
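The argument order described here can be sketched in plain Python: the discriminator's output goes in as the first argument (inside the log) and the constant 1/0 label goes in as the second. The names `bce`, `d_real`, `d_fake` and the sample probabilities are hypothetical; this is a single-scale illustration under those assumptions, not the repository's TensorFlow implementation.

```python
import math

def bce(preds, targets):
    """Binary cross-entropy with the prediction inside the log."""
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(preds, targets))

# Hypothetical discriminator outputs: probability that the input is real.
d_real = [0.8, 0.9]  # D applied to ground-truth frames
d_fake = [0.3, 0.1]  # D applied to generated frames

# Discriminator loss: real frames are labeled 1, generated frames 0.
d_loss = bce(d_real, [1, 1]) + bce(d_fake, [0, 0])

# Generator's adversarial loss: it wants D to label generated frames as 1.
g_loss = bce(d_fake, [1, 1])

print(d_loss, g_loss)
```

In both calls the network output is the first argument and the label the second, matching the reading that Y is the prediction and Ŷ the label in L_bce(Y, Ŷ).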

3rd3 commented 7 years ago

That also corresponds to the definition on Wikipedia. I should have looked there first. Thanks!