atulkum / pointer_summarizer

pytorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"
Apache License 2.0
904 stars · 243 forks

During training and prediction, when `step = 0`, `coverage` is initialized differently: during training it is an all-zero tensor, but this is not the case during prediction. #44

Closed 997261095 closed 4 years ago

997261095 commented 4 years ago

What is the reason for this?

https://github.com/atulkum/pointer_summarizer/blob/a41b69ba4f7eb6ffdeaf99a35bf9c6607ca5db56/training_ptr_gen/model.py#L152

atulkum commented 4 years ago

This step variable comes from the decoding loop here (the di variable): https://github.com/atulkum/pointer_summarizer/blob/a41b69ba4f7eb6ffdeaf99a35bf9c6607ca5db56/training_ptr_gen/train.py#L90

This is done so that coverage is initialized in the very first iteration. You can think of it as a uniform coverage distribution at the start of decoding (as per my understanding).
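
For context, here is a minimal, self-contained sketch of the coverage mechanism described in the paper (this is not the repository's actual code; the shapes, names, and placeholder attention are illustrative assumptions). Coverage starts as an all-zero tensor and accumulates the attention distributions of the previous decoder steps, so at step 0 nothing has been covered yet:

```python
import torch

# Illustrative shapes: batch size B, encoder sequence length T_k.
B, T_k = 8, 50

# Coverage starts as an all-zero tensor and, per the paper,
# accumulates the attention distributions of all previous steps:
# c^t = sum_{t' < t} a^{t'}.
coverage = torch.zeros(B, T_k)

max_dec_steps = 10  # stand-in for config.max_dec_steps
for di in range(max_dec_steps):  # `di` plays the role of the `step` argument
    # Placeholder attention distribution over the source tokens.
    attn_dist = torch.softmax(torch.rand(B, T_k), dim=1)

    if di == 0:
        # At the very first decoding step no source token has been
        # attended to yet, so coverage is still all zeros.
        assert torch.all(coverage == 0)

    # Coverage update: add this step's attention distribution.
    coverage = coverage + attn_dist
```

In this sketch the training and prediction paths behave the same way; the branch at model.py#L152 only concerns how that first-step coverage is produced when the decoder is driven one step at a time during prediction.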