amritasaha1812 / CSQA_Code


`kvmem` vs `decoder` loss? #8

Closed sanyam5 closed 5 years ago

sanyam5 commented 6 years ago

What is the difference between kvmem loss and decoder loss? How do you decide which to use?

vardaan123 commented 6 years ago

Both the decoder and kvmem losses are required for the model to train.

Decoder loss: This loss makes the model output the correct natural-language tokens. It reduces the cross-entropy between the predicted and target vocabulary probability distributions at each decoding step.
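As a rough illustration of the decoder loss (not the repository's actual code), here is a minimal pure-Python sketch of per-step cross-entropy. The function names `log_softmax` and `decoder_loss`, the list-of-lists input layout, and the averaging over steps are all assumptions made for this example.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of unnormalized scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def decoder_loss(step_logits, target_ids):
    """Mean cross-entropy over decoding steps (hypothetical helper).

    step_logits: one list of vocabulary scores per decoding step.
    target_ids:  the gold token index for each step.
    """
    total = 0.0
    for logits, target in zip(step_logits, target_ids):
        # Negative log-probability the model assigns to the gold token.
        total -= log_softmax(logits)[target]
    return total / len(target_ids)
```

With uniform scores over a vocabulary of size V, this returns log(V); the loss falls as the model puts more probability on the gold token at each step.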

kvmem loss: This loss is used to train the key-value memory network. The model should learn to assign weights over the memory value locations that match the gold target entities (which are fed to the model in training mode). Here too, we minimize the cross-entropy.
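A similar sketch for the kvmem loss, again only as an illustration: it treats the gold target entities as a 0/1 mask over memory slots and normalizes that mask into a target distribution. The name `kvmem_loss`, the mask representation, and the uniform normalization over gold slots are assumptions; the repository may encode the targets differently.

```python
import math

def kvmem_loss(memory_scores, gold_mask):
    """Cross-entropy between memory attention weights and gold entities.

    memory_scores: unnormalized score for each memory slot.
    gold_mask:     1 for slots holding a gold target entity, else 0
                   (assumed representation; at least one slot must be 1).
    """
    # Log-softmax over memory slots gives the log attention weights.
    m = max(memory_scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in memory_scores))
    log_weights = [s - log_z for s in memory_scores]

    # Spread the target probability mass evenly over the gold slots.
    n_gold = sum(gold_mask)
    return -sum(g / n_gold * lw for g, lw in zip(gold_mask, log_weights))
```

The loss shrinks as the attention weights concentrate on the slots marked as gold, which is the behavior the answer describes.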