Closed sanghyuk-choi closed 1 year ago
I think this is done on purpose, but @pltrdy may explain better.
It actually seems like a mistake to me, a^t
being included in the summation we have min(a^t, c^t) = a^t
which does not really make sense.
@Sanghyuk-Choi if you're still around can you PR ?
Yes, I'll make PR within this week.
2022년 12월 8일 (목) 01:37, Vincent Nguyen @.***>님이 작성:
@Sanghyuk-Choi https://github.com/Sanghyuk-Choi if you're still around can you PR ?
—
following abisee(2017), coverage attention equation is
but code implementation below,
coverage
seems containingt
at timestep t, and it should be sum from0
tot-1
https://github.com/OpenNMT/OpenNMT-py/blob/36748a5f280fbe781a86a82b7df85166796d49d7/onmt/decoders/decoder.py#L412-L415 so I think L413 would be replaced like thishttps://github.com/OpenNMT/OpenNMT-py/blob/36748a5f280fbe781a86a82b7df85166796d49d7/onmt/utils/loss.py#L308-L311 and coverage loss would be replaced like this
check if this correct and if necessary, I'll open a pull request. thank you.