Lab 3 - base.py Acccuracy.update() Error:

the-full-stack / fsdl-text-recognizer-2021-labs

Complete deep learning project developed in Full Stack Deep Learning, Spring 2021

https://bit.ly/berkeleyfsdl

MIT License

452 stars 281 forks source link

Lab 3 - base.py Acccuracy.update() Error: #27

Open just-eoghan opened 3 years ago

just-eoghan commented 3 years ago

System specs XPS 13 Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz 1.99 GHz NVIDIA Geforce Rtx 2080 Super

Problem

Running the following: python training/run_experiment.py --max_epochs=10 --gpus=1 --num_workers=4 --data_class=EMNISTLines --min_overlap=0 --max_overlap=0 --model_class=LineCNNSimple --window_width=28 --window_stride=28

Results in the following error: ValueError: Probabilities inpredsmust sum up to 1 accross theCdimension.

Solution

I managed to track down the error to the update function within the Accuracy class in base.py.

The offending line is: preds = torch.nn.functional.softmax(preds, dim=-1)

Where the dim=-1 paramater is causing this value error. Setting this to dim=1 solves the issue and allows training to take place.

I don't fully understand why this is the case or why this error presented in the first place. Any guidance would be appreciated!

mprostock commented 3 years ago

Stumbled accross this myself, just created a PR to fix it.

The reason for the problem is that when using the new models in Lab 3 like SimpleLineCNN or the LineCNN the predictions get a 3rd dimension, because it is a sequence of letters now. In Lab1/2 we were predicting single letters only.

The Accuracy Fix/Hack uses dim=-1, which works as long as there are only 2 dimensions (batch, class), but from lab3 does the softmax over the wrong dimension. (Dims are [128, 83, 32] (bs, num_classes, len_seq)). So setting the softamax to use dims=1 instead of dims=-1 makes it use the correct dimension of classes to "softmax over".

just-eoghan commented 3 years ago

Thanks for that Marc makes sense. I'll +1 your PR!

just-eoghan commented 3 years ago

Closing this now as there is a PR by @mprostock in progress.

mprostock commented 3 years ago

I understand your apporach, but it is common practice to leave tickets/issues open until resolved (or replied to by maintainer). My PR might not get accepted, they might choose to fix it in several other different ways. Until then it would be good to keep the issue open, so that other people can easily verify they are not alone with their problem and that this issue exists, until it is actually fixed in the code. So - could you reopen this issue?

just-eoghan commented 3 years ago

Point taken, reopened.

Caizifen commented 2 years ago

Thanks. Having the same problem.