fastai / fastbook

The fastai book, published as Jupyter Notebooks

BCEWithLogitsLossFlat and BCEWithLogitsLoss #582

Open kfkelvinng opened 1 year ago

kfkelvinng commented 1 year ago

I tried to understand the difference between the two, but they are almost identical in terms of behaviour. According to the documentation at https://docs.fast.ai/losses.html#bcewithlogitslossflat, nn.BCEWithLogitsLoss is supposed to fail on this example, but it doesn't:

tst = BCEWithLogitsLossFlat()
output = torch.randn(32, 5, 10)
target = torch.randn(32, 5, 10)
#nn.BCEWithLogitsLoss would fail with those two tensors, but not our flattened version.
_ = tst(output, target)
test_fail(lambda x: nn.BCEWithLogitsLoss()(output,target))

The correct test should be:

test_fail(lambda: nn.BCEWithLogitsLoss()(output,target))

instead of

test_fail(lambda x: nn.BCEWithLogitsLoss()(output,target))

In contrast, in https://docs.fast.ai/losses.html#crossentropylossflat, nn.CrossEntropyLoss does fail, but the test is still misleading.