Open EomSooHwan opened 5 months ago
Hi, I am wondering if there is any way to change the stride of the local attention window.
For example, the i-th query would attend to keys in [i * stride + seqlen_q - seqlen_k + win_size[0], i * stride + seqlen_q - seqlen_k + win_size[1]].
No, that's not supported.
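For anyone who wants to see what the requested pattern looks like, here is a minimal pure-Python sketch that builds the boolean mask straight from the formula in the question. It is not part of this library: `strided_window_mask` and its parameters are hypothetical names, and the bounds follow the question's formula literally (including its sign convention for `win_size`), which may differ from how the library defines its own window.

```python
def strided_window_mask(seqlen_q, seqlen_k, win_size, stride=1):
    """Hypothetical helper: mask[i][j] is True where query i may attend to key j.

    Bounds are taken literally from the formula in the question:
    [i * stride + seqlen_q - seqlen_k + win_size[0],
     i * stride + seqlen_q - seqlen_k + win_size[1]]  (both ends inclusive)
    """
    left, right = win_size
    offset = seqlen_q - seqlen_k
    mask = []
    for i in range(seqlen_q):
        lo = i * stride + offset + left
        hi = i * stride + offset + right
        # Keys outside [0, seqlen_k) are simply never selected.
        mask.append([lo <= j <= hi for j in range(seqlen_k)])
    return mask


# Example: 3 queries over 6 keys, window advancing 2 keys per query.
mask = strided_window_mask(seqlen_q=3, seqlen_k=6, win_size=(3, 4), stride=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

A dense mask like this could be passed to an attention implementation that accepts arbitrary boolean masks, at the cost of the memory and speed benefits of a fused local-attention kernel.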