According to the docs:
But if the target passed to the loss function is 0 for negative cases, the loss for those cases is always 0, so nothing is learned from them:
It seems like the target needs to be normalized in the classification case, but I don't see anywhere in the code where that happens. (Note that I haven't actually run the code to confirm it misbehaves; I'm only reading it, and this didn't make sense to me. Did I miss something?)
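To make the concern concrete, here is a minimal sketch that assumes the target enters the loss only as a multiplicative factor. This is not the library's actual code: `multiplicative_loss` and `grad_wrt_score` are hypothetical names chosen for illustration, and the real loss may have a different form.

```python
def multiplicative_loss(target, score):
    # Hypothetical loss form (an assumption): the target only scales the score.
    return -target * score


def grad_wrt_score(target, score, eps=1e-6):
    # Central finite difference of the loss with respect to the score.
    upper = multiplicative_loss(target, score + eps)
    lower = multiplicative_loss(target, score - eps)
    return (upper - lower) / (2 * eps)


scores = [-2.0, 0.5, 3.0]

# With a target of 0, both the loss and its gradient vanish for every score.
zero_target_losses = [multiplicative_loss(0.0, s) for s in scores]
zero_target_grads = [grad_wrt_score(0.0, s) for s in scores]

# With a target of -1, the gradient is non-zero, so negatives contribute.
signed_target_grads = [grad_wrt_score(-1.0, s) for s in scores]

print(zero_target_losses, zero_target_grads, signed_target_grads)
```

If the actual loss has this shape, negatives labelled 0 would contribute nothing to training, while negatives labelled -1 would.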