carpentries-incubator / deep-learning-intro

Learn Deep Learning with Python
https://carpentries-incubator.github.io/deep-learning-intro/

Set baseline expectations for categorical cross-entropy #472

Open qualiaMachine opened 5 months ago

qualiaMachine commented 5 months ago

In the same spirit as the regression episode, I thought it might be useful to establish a baseline expectation in terms of the categorical cross-entropy loss metric. You can calculate the expected loss of a model that guesses at random (i.e., assigns equal probability to every class) as log(n), where n is the number of classes and log is the natural logarithm.
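As a minimal sketch of that baseline (not the code from the PR itself), the snippet below computes log(n) and checks it against the cross-entropy of a uniform prediction. The helper names, the choice of `n_classes = 3`, and the randomly generated labels are illustrative assumptions.

```python
import numpy as np


def random_guess_baseline(n_classes: int) -> float:
    """Cross-entropy (natural log) of a uniform prediction over n_classes."""
    return float(np.log(n_classes))


def categorical_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean categorical cross-entropy for one-hot targets and predicted probabilities."""
    eps = 1e-12  # guards against log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1)))


# Illustrative setup (assumption): 3 classes, 100 randomly labelled samples.
n_classes = 3
rng = np.random.default_rng(0)
labels = rng.integers(0, n_classes, size=100)
y_true = np.eye(n_classes)[labels]  # one-hot encode the labels

# A model that "guesses at random" assigns probability 1/n to every class.
y_uniform = np.full((100, n_classes), 1.0 / n_classes)

baseline = random_guess_baseline(n_classes)
measured = categorical_cross_entropy(y_true, y_uniform)
print(f"log(n) baseline: {baseline:.4f}, measured loss: {measured:.4f}")
```

The two numbers agree regardless of how the labels are distributed, because a uniform prediction assigns the same probability to the true class of every sample.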

github-actions[bot] commented 5 months ago

Thank you!

Thank you for your pull request :smiley:

:robot: This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

Rendered Changes

:mag: Inspect the changes: https://github.com/carpentries-incubator/deep-learning-intro/compare/md-outputs..md-outputs-PR-472

The following changes were observed in the rendered markdown documents:

 2-keras.md | 10 +++++++++-
 md5sum.txt |  4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible. This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

:stopwatch: Updated at 2024-06-04 16:57:15 +0000

qualiaMachine commented 5 months ago

Thank you for your comment, @svenvanderburg! I agree that a baseline measure in terms of accuracy is probably more intuitive. However, I would argue that it's useful for real-world deep learning practitioners to know the equation needed to establish a baseline using cross-entropy loss. It's a very common loss metric that comes up in all kinds of classification problems. Even if they don't fully understand the math, it can be a useful thing to memorize, and it can help you detect problems while the model is still training. In contrast, interpreting the confusion matrix / accuracy is intuitive enough that I'm not sure it's worth a callout.
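To illustrate how the baseline could flag problems during training, here is a hedged sketch of a check that compares a reported loss against log(n). The function name `compare_to_baseline` and the `tolerance` value are hypothetical choices for illustration, not part of the lesson or of Keras.

```python
import math


def compare_to_baseline(loss: float, n_classes: int, tolerance: float = 0.05) -> str:
    """Classify a cross-entropy loss relative to the random-guess baseline log(n).

    `tolerance` is an arbitrary illustrative margin, not a recommended value.
    """
    baseline = math.log(n_classes)
    if loss > baseline + tolerance:
        return "worse than random"
    if loss >= baseline - tolerance:
        return "near random"
    return "better than random"


# Example: losses one might read off a training log for a 3-class problem.
for epoch_loss in (2.0, 1.10, 0.3):
    print(epoch_loss, "->", compare_to_baseline(epoch_loss, n_classes=3))
```

A loss that stays near or above log(n) after several epochs suggests the model has not learned anything useful yet, which is worth investigating before training further.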

svenvanderburg commented 5 months ago

@qualiaMachine Agree that it is useful to know. Although I worked on deep learning for many years with just an intuitive understanding of crossentropy without the mathematics 🙈😂.

Maybe it's an idea that we introduce it in episode 4? There we also use categorical crossentropy, if I am not mistaken. That way we keep episode 2 relatively clean and don't overwhelm students.

Otherwise I suggest putting the addition that you currently have in a callout box and adding a little bit more context. I think that to fully understand why the loss would be log(n), you need some more explanation of the mathematics.
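For reference, the missing step can be sketched as follows, assuming one-hot targets $y$, predicted probabilities $\hat{y}$, and the natural logarithm (the symbols are introduced here for illustration and do not appear in the thread):

```latex
% Categorical cross-entropy for a single sample with n classes
L = -\sum_{i=1}^{n} y_i \log \hat{y}_i

% One-hot target: y_c = 1 for the true class c, and y_i = 0 otherwise
L = -\log \hat{y}_c

% Random guessing assigns equal probability to every class: \hat{y}_c = 1/n
L = -\log \frac{1}{n} = \log n
```

Because every sample yields the same value, the mean loss over a dataset is also log(n).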

qualiaMachine commented 5 months ago

I totally get the desire to keep episode 2 light. The only thing that has me wanting to stick to episode 2 is that's where we introduce categorical cross-entropy. I think my explanation of the baseline math might make more sense if I also write a little paragraph unpacking categorical cross-entropy loss in episode 2. I can probably do that all as a callout box if that seems most appropriate? I could also concede and stick to episode 4 if you'd really like -- I won't die on this hill haha.

svenvanderburg commented 5 months ago

OK, let's go for a callout box explaining crossentropy loss and introducing the baseline loss in episode 2. We can always move it to 4 if it doesn't work.