Your idea is excellent, and I have starred your repo. I want to check that my understanding is correct:
This paper does not modify the kernel implementation. Instead, it exploits the fact that different rows along the sequence dimension of Q are independent, so it computes each chunk all the way from attention through the FFN in one pass. The intermediate results are consumed quickly, which makes larger sequence lengths feasible.
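In other words, I imagine the per-chunk schedule looks roughly like this. This is only a NumPy sketch of my reading, with made-up shapes and a toy ReLU FFN, not the paper's actual code; note K and V still span the full sequence, so only the query rows are chunked:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def block_attn_ffn(Q, K, V, W1, W2, chunk=4):
    """Process query rows in chunks: each chunk goes through attention
    *and* the FFN before the next chunk starts, so the full (seq x seq)
    score matrix and the full FFN hidden activations never coexist."""
    d = Q.shape[-1]
    out = np.empty_like(Q)
    for i in range(0, Q.shape[0], chunk):
        q = Q[i:i + chunk]                    # (c, d) slice of query rows
        scores = q @ K.T / np.sqrt(d)         # (c, seq): only a strip of scores
        a = softmax(scores) @ V               # attention output for this chunk
        out[i:i + chunk] = np.maximum(a @ W1, 0) @ W2  # FFN applied immediately
    return out
```

Since each query row's attention output depends only on that row's scores, the chunked result should match computing attention and the FFN over the whole sequence at once.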
Is it correct?