Yuran-Zhao closed this issue 1 year ago
Hello Yuran,
Thank you for working with our method and reaching out for clarification.
The line you're inquiring about corresponds to the computation of the "full decoder layer contribution matrix", as detailed in the closing part of Section 3.1 in the paper. In particular, it's associated with the process of "weighting every row of $C_{y \leftarrow y_{<t}}$ by the corresponding value of the residual contribution of each time step".
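To make the row weighting concrete, here is a minimal sketch in plain Python rather than the repository's actual tensor code. It assumes a hypothetical layout in which each row of `cross_contributions` holds the source-token contributions followed by one residual entry, and in which every row of both matrices sums to 1. The function name and the toy numbers are illustrative only.

```python
def full_decoder_layer_contributions(cross_contributions, self_dec_contributions):
    """Combine cross-attention and decoder self-attention contributions.

    Assumed (hypothetical) layout, one row per target time step:
      cross_contributions[t]    = [source contributions..., residual]
      self_dec_contributions[t] = contributions of the target prefix tokens
    """
    full = []
    for cross_row, self_row in zip(cross_contributions, self_dec_contributions):
        # Split off the residual entry, assumed to be the last column.
        *source_part, residual = cross_row
        # Weight the prefix contributions by this time step's residual share.
        weighted_prefix = [residual * c for c in self_row]
        full.append(source_part + weighted_prefix)
    return full


# Toy example: 2 target steps, 2 source tokens, 2 prefix positions.
cross = [[0.5, 0.25, 0.25],
         [0.25, 0.25, 0.5]]
self_dec = [[1.0, 0.0],
            [0.5, 0.5]]

full = full_decoder_layer_contributions(cross, self_dec)

# Direct concatenation without the weighting, for comparison.
naive = [c[:-1] + s for c, s in zip(cross, self_dec)]

print([sum(row) for row in full])   # each row still sums to 1
print([sum(row) for row in naive])  # rows no longer sum to 1
```

Under these assumptions, concatenating the two matrices directly gives the prefix tokens their full self-attention share regardless of how much of the layer output actually flows through the residual, so the rows stop being a normalized distribution over source and prefix tokens.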
If you have any further questions or need more detailed explanations on any aspect, please don't hesitate to ask.
Best regards, Gerard
Hi Gerard,
Thanks for your quick reply! It is clear now after your explanations :)
Hi Ferrando,
I find your work very interesting, and thank you for your efforts in providing open-source code.
However, I'm a little confused about the purpose of Line 588 in `transformer-contributions-nmt/wrappers/transformer_wrapper.py`. Why can't we just concatenate `self_dec_contributions` and `cross_contributions` directly, just like Line 589?