fastai / fastbook

The fastai book, published as Jupyter Notebooks

Small correction to 06_multicat #647

Open andrewkchan opened 4 months ago

I believe the first sentence of this passage is not worded correctly:

Note that because we have a one-hot-encoded dependent variable, we can't directly use nll_loss or softmax (and therefore we can't use cross_entropy):

  • softmax, as we saw, requires that all predictions sum to 1, and tends to push one activation to be much larger than the others (due to the use of exp); however, we may well have multiple objects that we're confident appear in an image, so restricting the maximum sum of activations to 1 is not a good idea. By the same reasoning, we may want the sum to be less than 1, if we don't think any of the categories appear in an image.
  • nll_loss, as we saw, returns the value of just one activation: the single activation corresponding with the single label for an item. This doesn't make sense when we have multiple labels.

One-hot encoding by itself does not rule out softmax. The reason should be that the objective is multi-label, as the bullet points explain, right?
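To illustrate the distinction, here is a minimal sketch in plain Python (standard library only, not the fastai or PyTorch implementations). The activations, targets, and helper names are hypothetical and chosen only for this example. It shows that softmax with a one-hot target works fine for a single-label item, while a multi-label item needs per-category sigmoids with binary cross-entropy.

```python
import math

def softmax(acts):
    # Subtract the max for numerical stability; the outputs always sum to 1.
    m = max(acts)
    exps = [math.exp(a - m) for a in acts]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(a):
    # Squashes each activation independently into (0, 1).
    return 1.0 / (1.0 + math.exp(-a))

def cross_entropy_one_hot(acts, one_hot):
    # Softmax followed by negative log likelihood, written against a
    # one-hot target: only the single "hot" position contributes.
    probs = softmax(acts)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def binary_cross_entropy(acts, targets):
    # Mean over categories of -[t*log(p) + (1-t)*log(1-p)], p = sigmoid(act).
    losses = []
    for a, t in zip(acts, targets):
        p = sigmoid(a)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1 - p)))
    return sum(losses) / len(losses)

# Hypothetical activations for three categories.
acts = [4.0, 3.5, -5.0]

# Single-label case: a one-hot target is no obstacle to softmax.
single_label_loss = cross_entropy_one_hot(acts, [1, 0, 0])

# Multi-label case: two categories are present at once.
multi_hot = [1, 1, 0]
softmax_probs = softmax(acts)               # forced to share a total of 1
sigmoid_probs = [sigmoid(a) for a in acts]  # each category judged on its own
multi_label_loss = binary_cross_entropy(acts, multi_hot)

print(softmax_probs, sigmoid_probs, single_label_loss, multi_label_loss)
```

With softmax, the two present categories have to split a total probability of 1, so they cannot both be close to 1. With sigmoid, each can be close to 1 independently, which is what a multi-label target needs.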