train.py:

compute the token prediction Acc.

    non_pad_mask = cap_labels[:, 1:].ne(Constants.PAD)
    n_word = non_pad_mask.sum().item()
    cms_non_pad_mask = cms_labels[:, 1:].ne(Constants.PAD)
    cms_n_word = cms_non_pad_mask.sum().item()
    cap_loss /= n_word
    cms_loss /= n_word

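I'm a bit curious about this calculation. cap_loss and cms_loss are both divided by n_word, which is the number of non-PAD tokens in cap_labels. Why isn't cms_loss divided by cms_n_word, the non-PAD count from cms_labels, which is computed but not used in this normalization? I'd appreciate your clarification. Thank you!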
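To make the question concrete, here is a small sketch in plain Python (no PyTorch, so it runs anywhere). It mimics the `labels[:, 1:].ne(PAD).sum().item()` count with nested lists. The `PAD` value, the toy label batches, the summed loss of 8.0, and the helper `count_non_pad` are all made up for illustration and are not taken from the repository.

```python
PAD = 0  # assumption: stands in for Constants.PAD


def count_non_pad(labels, pad=PAD):
    """Count non-PAD tokens after dropping the first token of each row.

    Plain-Python equivalent of labels[:, 1:].ne(pad).sum().item().
    """
    return sum(1 for row in labels for tok in row[1:] if tok != pad)


# Hypothetical toy batch: captions are longer than commonsense sequences.
cap_labels = [[1, 5, 6, 7, 2], [1, 8, 9, 2, 0]]
cms_labels = [[1, 4, 2, 0, 0], [1, 3, 2, 0, 0]]

n_word = count_non_pad(cap_labels)      # caption token count
cms_n_word = count_non_pad(cms_labels)  # commonsense token count

cms_loss_sum = 8.0  # made-up summed commonsense loss

# Normalization as written in the snippet above: caption token count.
cms_loss_by_n_word = cms_loss_sum / n_word
# Normalization by the commonsense sequence's own token count.
cms_loss_by_cms_n_word = cms_loss_sum / cms_n_word

print(n_word, cms_n_word)
print(cms_loss_by_n_word, cms_loss_by_cms_n_word)
```

In this toy batch the two counts differ (7 vs. 4), so dividing by n_word reports a smaller per-token cms_loss than dividing by cms_n_word would. Whether that difference is intended in train.py is exactly what I'm asking.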

jacobswan1 / Video2Commonsense

The question of computing the token prediction Acc. #11
