WYL-Projects opened this issue 2 years ago (status: Open)
Hello,
Thank you for your interest in our work. Good questions.
Yes, A denotes the number of same-scale neighbor nodes that a node in the middle of the sequence (that is, most nodes in the sequence) can attend to. Nodes near the left and right ends of the sequence can attend to fewer than A same-scale neighbors. Equations (8) and (12) account for this by using the upper bound A when computing the complexity.
Yes, the diagram for A=5 is correct, and so is your understanding.
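To make the boundary effect concrete, here is a minimal sketch that counts the same-scale neighbors each node can attend to. It is only an illustration, not code from the paper: the function name `same_scale_neighbor_counts` is hypothetical, and it assumes that the window is centered on the node, that A is odd, and that the count of A includes the node itself.

```python
from typing import List


def same_scale_neighbor_counts(seq_len: int, A: int) -> List[int]:
    """Count same-scale nodes each position can attend to.

    Assumptions (not confirmed by the paper): the window is centered,
    A is odd, and the node itself is one of the A attended nodes.
    """
    half = A // 2
    counts = []
    for i in range(seq_len):
        left = max(i - half, 0)              # clip the window at the left end
        right = min(i + half, seq_len - 1)   # clip the window at the right end
        counts.append(right - left + 1)
    return counts


counts = same_scale_neighbor_counts(seq_len=8, A=5)
print(counts)  # [3, 4, 5, 5, 5, 5, 4, 3]

# Every node attends to at most A neighbors, so A * seq_len bounds the total.
assert max(counts) <= 5
assert sum(counts) <= 5 * 8
```

Under these assumptions only the two nodes at each end fall below A, which is why using A as an upper bound changes nothing in the asymptotic complexity.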
Hello, Author. Thanks for the high-performance pyramid attention you have proposed. However, while reviewing the paper I ran into several difficulties, as follows: