Xiyu-AI opened this issue 12 months ago
train.py:
```python
non_pad_mask = cap_labels[:, 1:].ne(Constants.PAD)
n_word = non_pad_mask.sum().item()
cms_non_pad_mask = cms_labels[:, 1:].ne(Constants.PAD)
cms_n_word = cms_non_pad_mask.sum().item()
cap_loss /= n_word
cms_loss /= n_word
```
I'm a bit curious about these calculations. When computing `cap_loss` and `cms_loss`, why are they both divided by `n_word`? Why isn't `cms_loss` divided by `cms_n_word`? I'd appreciate your clarification. Thank you!
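To make the concern concrete, here is a minimal plain-Python sketch (hypothetical token ids and made-up loss sums, not the actual train.py values) showing how the two normalizations diverge whenever the caption and CMS sequences contain different numbers of non-pad tokens:

```python
# Assumed padding id, mirroring Constants.PAD in the repo.
PAD = 0

cap_labels = [5, 7, 9, PAD, PAD]    # hypothetical caption token ids
cms_labels = [3, 4, PAD, PAD, PAD]  # hypothetical CMS token ids

# Count non-pad tokens, analogous to the .ne(Constants.PAD) masks.
n_word = sum(t != PAD for t in cap_labels)      # 3 non-pad caption tokens
cms_n_word = sum(t != PAD for t in cms_labels)  # 2 non-pad CMS tokens

cap_loss_sum, cms_loss_sum = 6.0, 6.0  # made-up summed cross-entropy losses

# As written in train.py: both losses divided by the caption count.
cms_loss_as_written = cms_loss_sum / n_word      # 6.0 / 3 = 2.0

# Per-token average over the CMS sequence itself:
cms_loss_per_token = cms_loss_sum / cms_n_word   # 6.0 / 2 = 3.0
```

So dividing `cms_loss` by `n_word` gives a per-caption-token average rather than a true per-CMS-token average; the two only coincide when `n_word == cms_n_word`.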