@jinmang2 Hi MyungHoon! I think you are right, thank you for catching this bug - I've released the changes in 0.1.6 https://github.com/lucidrains/h-transformer-1d/releases/tag/0.1.6 , do you want to review this and see if it resolves the issue you described?
Thanks for solving it so quickly 😄
Thank you for the opportunity to review! I will check the 0.1.6 code and leave a comment.
closing, since i think it's fine now
Paper Implementation
In the implementation, the blocked Q, K, V tensors for each level are obtained with the code below.
https://github.com/lucidrains/h-transformer-1d/blob/110cab0038898560d72d460bfef8ca8b7f17f0a5/h_transformer_1d/h_transformer_1d.py#L164-L179
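For readers without the permalink handy, here is a minimal sketch of the idea, not the repository's exact code: the function name `build_level_qkvs`, the block size `bsz`, the level-count formula, and the tensor shapes are all illustrative assumptions. Each level halves the resolution by averaging adjacent pairs of positions, producing the per-level Q, K, V tensors.

```python
import torch
from math import log2
from einops import rearrange

def build_level_qkvs(q, k, v, bsz=16):
    # q, k, v: (batch, seq_len, dim); seq_len assumed a power-of-two multiple of bsz
    seq_len = q.shape[-2]
    num_levels = int(log2(seq_len // bsz)) - 1  # assumed level count
    qkvs = [(q, k, v)]  # level 0: full resolution
    for _ in range(num_levels):
        # coarsen: average every adjacent pair of positions, halving the resolution
        q, k, v = map(
            lambda t: rearrange(t, 'b (n r) d -> b n r d', r=2).mean(dim=2),
            (q, k, v),
        )
        qkvs.append((q, k, v))
    return qkvs
```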
The final result of the matrix-matrix product from Equation 29 (or 69) is then assembled with the for loop below.
https://github.com/lucidrains/h-transformer-1d/blob/110cab0038898560d72d460bfef8ca8b7f17f0a5/h_transformer_1d/h_transformer_1d.py#L234-L247
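Again as a hedged sketch rather than the repository's exact code (the names `Ys`, `Y_level`, `A_level` and the coarsest-to-finest ordering are assumptions): the loop can be pictured as accumulating per-level softmax numerators and denominators, upsampling the running sums by repetition before each finer level is added, and normalizing once at the end.

```python
import torch
from einops import rearrange, repeat

def combine_levels(Ys, eps=1e-8):
    # Ys: list of (Y_level, A_level) pairs, assumed ordered coarsest -> finest;
    # Y_level: (batch, n, dim) un-normalized numerator, A_level: (batch, n) denominator
    Y, A = 0, 0
    for Y_level, A_level in Ys:
        if torch.is_tensor(Y):
            # double the resolution of the running sums to match the finer level
            Y = repeat(Y, 'b n d -> b (n r) d', r=2)
            A = repeat(A, 'b n -> b (n r)', r=2)
        Y = Y + Y_level
        A = A + A_level
    # normalize the accumulated numerator by the accumulated denominator
    return Y / rearrange(A, 'b n -> b n 1').clamp(min=eps)
```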
What is the problem?
However, with the current code it is not possible to include the information from the level-0 off-diagonal (white) blocks shown in the figure below, even though Equation 70 of the paper includes the corresponding attention-matrix entries.
I think an off-diagonal near-interaction term (level 0) should also be added to match Equation 69! A hedged sketch of what this could look like follows.
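To illustrate the suggestion, here is a hypothetical level-0 near-interaction in which each query block attends to its own (diagonal) block plus the two adjacent off-diagonal blocks. All names are illustrative; for brevity it normalizes with a local softmax, whereas the repository accumulates numerators and denominators across levels, and the zero-padded boundary blocks would need masking in a real implementation.

```python
import torch
import torch.nn.functional as F
from einops import rearrange

def level0_with_neighbors(q, k, v, bsz=16):
    # block the sequences: (batch, n_blocks, bsz, dim)
    q, k, v = map(lambda t: rearrange(t, 'b (n z) d -> b n z d', z=bsz), (q, k, v))
    # zero-pad one block on each side of the block axis so every block has neighbors
    k_pad = F.pad(k, (0, 0, 0, 0, 1, 1))
    v_pad = F.pad(v, (0, 0, 0, 0, 1, 1))
    # widen each key/value window: [left neighbor | self | right neighbor]
    k_cat = torch.cat([k_pad[:, :-2], k_pad[:, 1:-1], k_pad[:, 2:]], dim=2)
    v_cat = torch.cat([v_pad[:, :-2], v_pad[:, 1:-1], v_pad[:, 2:]], dim=2)
    # block attention over the widened window (diagonal + off-diagonal blocks)
    sim = torch.einsum('b n i d, b n j d -> b n i j', q, k_cat) * q.shape[-1] ** -0.5
    attn = sim.softmax(dim=-1)
    out = torch.einsum('b n i j, b n j d -> b n i d', attn, v_cat)
    return rearrange(out, 'b n z d -> b (n z) d')
```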