kevinzakka / recurrent-visual-attention

A PyTorch Implementation of "Recurrent Models of Visual Attention"
MIT License

Why can the value of the loss function be negative? #24

Closed hhu06 closed 4 years ago

hhu06 commented 5 years ago

Why can the value of the loss function be a negative number? What does it mean when train_loss or val_loss < 0?

Epoch: 3/70 - LR: 0.000300 - loss: -0.948 - acc: 100.000 - train loss: 1.554 - train acc: 76.020 - val loss: 2.431 - val acc: 70.833
Epoch: 4/70 - LR: 0.000300 - loss: -1.604 - acc: 100.000 - train loss: 1.571 - train acc: 76.020 - val loss: 1.200 - val acc: 70.833
Epoch: 5/70 - LR: 0.000300 - loss: -1.605 - acc: 100.000 - train loss: 0.381 - train acc: 76.531 - val loss: 0.502 - val acc: 70.833
Epoch: 6/70 - LR: 0.000300 - loss: -0.820 - acc: 100.000 - train loss: -0.091 - train acc: 76.020 - val loss: 0.235 - val acc: 70.833
Epoch: 7/70 - LR: 0.000300 - loss: 0.282 - acc: 50.000 - train loss: -0.178 - train acc: 76.020 - val loss: -0.037 - val acc: 72.917 [*]
Epoch: 8/70 - LR: 0.000300 - loss: 1.670 - acc: 50.000 - train loss: -0.814 - train acc: 82.653 - val loss: 1.127 - val acc: 70.833

hhu06 commented 5 years ago

@kevinzakka I'm a newbie and feel quite confused about this; I hope you can help me, thanks!

hhu06 commented 5 years ago

@kevinzakka Could you give an answer when you are free? I really need your help and hope to hear from you soon.

seidels commented 5 years ago

@hhu06 If you have a look at trainer.py line 273, you can see that the loss is accumulated into a hybrid loss: loss = loss_action + loss_baseline + loss_reinforce, where (roughly) loss_action is the classification negative log-likelihood, loss_baseline is the mean squared error between the predicted baseline and the reward, and loss_reinforce is the REINFORCE term for the sampled glimpse locations.

So, assuming you have not run into an overflow problem, your negative loss value should come from loss_reinforce.

That means loss_reinforce is negative if your action was rewarded better than the baseline and the sum of the log probability densities evaluated at the sampled locations is positive, i.e. the densities are larger than 1, which, given the standard deviation you are using, means the sampled locations are not completely off the normal distribution you are currently sampling from, and vice versa.
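For concreteness, here is a rough sketch of how I read that part of trainer.py (paraphrased from memory, so the helper name hybrid_loss, the tensor shapes, and the variable names below are my own and may not match the repo exactly):

```python
import torch
import torch.nn.functional as F

def hybrid_loss(log_probas, y, baselines, log_pi, R):
    """Rough reconstruction of the hybrid loss around trainer.py line 273.

    Shapes and names are assumptions, not the repo's exact code:
      log_probas: (B, num_classes) log-softmax scores from the final glimpse
      y:          (B,) ground-truth labels
      baselines:  (B, T) predicted baseline, one per glimpse
      log_pi:     (B, T) log-density of each sampled location under N(mu_t, sigma)
      R:          (B, T) reward (1 if the final prediction is correct, else 0)
    """
    loss_action = F.nll_loss(log_probas, y)      # classification term, always >= 0
    loss_baseline = F.mse_loss(baselines, R)     # baseline regression term, always >= 0

    adjusted_reward = R - baselines.detach()     # advantage = reward - baseline
    loss_reinforce = torch.sum(-log_pi * adjusted_reward, dim=1).mean()

    # If the advantage is positive and the location densities are > 1 (so log_pi > 0),
    # -log_pi * adjusted_reward is negative, and the total can drop below zero
    # even though the first two terms cannot.
    return loss_action + loss_baseline + loss_reinforce

# Tiny sanity check with positive advantage and positive log-densities:
B, T, C = 4, 6, 10
log_probas = torch.log_softmax(torch.randn(B, C), dim=1)
y = torch.randint(0, C, (B,))
baselines = torch.full((B, T), 0.5)
log_pi = torch.full((B, T), 2.0)                 # densities e^2 > 1 (narrow Gaussian)
R = torch.ones(B, T)                             # every prediction rewarded
print(hybrid_loss(log_probas, y, baselines, log_pi, R))  # typically negative
```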

@kevinzakka please correct me if this is wrong

yxiao54 commented 4 years ago

@so-phi For the loss_reinforce term, the larger the underlying quantity the better, so it's confusing to combine it into such a hybrid loss.
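To make the sign convention concrete (again a sketch, not the repo's exact code; the log-densities and advantage values below are made up): loss_reinforce is the negative of the reward-weighted log-probability being maximized, so the better the action is rewarded relative to the baseline, the more negative this term, and with it the hybrid loss, becomes.

```python
import torch

log_pi = torch.full((1, 6), 2.0)        # hypothetical log-densities of 6 sampled locations
for advantage in (0.2, 0.5, 1.0):       # hypothetical reward-minus-baseline values
    # REINFORCE term: negative of the quantity being maximized
    loss_reinforce = torch.sum(-log_pi * advantage, dim=1).mean()
    print(f"advantage={advantage}: loss_reinforce={loss_reinforce.item():.1f}")
# Larger advantage (or larger log-densities) -> more negative loss_reinforce,
# which is why the total hybrid loss can dip below zero.
```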