burcehan closed this issue 4 years ago
Hmm, I am not sure I follow. If you are using the causal-linear model then why do you need the segment level recurrence?
Regardless of the above, if you do want to add a segment level recurrence, I would do it as a module that contains the segment transformer and not edit the CUDA code. This should be implementable at a higher level.
Cheers, Angelos
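For readers following along, here is a minimal NumPy sketch of the "do it at a higher level" suggestion. It is not code from this thread or from the library. It relies on the fact that causal linear attention can be written as a recurrence over two running sums, so a wrapper can carry those sums from one segment to the next without touching any kernel. The class name `SegmentRecurrentLinearAttention`, the `elu(x) + 1` feature map, and the `eps` value are assumptions made for illustration. Note that Transformer-XL caches hidden states rather than attention sums, so this is only an analogue of its state reuse.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive feature map (an assumption for this sketch)
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

class SegmentRecurrentLinearAttention:
    """Hypothetical wrapper: keeps the causal linear attention state
    (running sums over keys and values) between consecutive segments."""

    def __init__(self, dim_k, dim_v, eps=1e-6):
        self.kv = np.zeros((dim_k, dim_v))  # running sum of phi(k) v^T
        self.z = np.zeros(dim_k)            # running sum of phi(k)
        self.eps = eps

    def forward(self, Q, K, V):
        # Q, K: (L, dim_k), V: (L, dim_v) for the current segment
        q, k = feature_map(Q), feature_map(K)
        # Prefix sums inside the segment, offset by the carried state
        kv = self.kv + np.cumsum(k[:, :, None] * V[:, None, :], axis=0)
        z = self.z + np.cumsum(k, axis=0)
        num = np.einsum("ld,ldm->lm", q, kv)
        den = np.einsum("ld,ld->l", q, z) + self.eps
        # Store the state after the last position for the next segment
        self.kv, self.z = kv[-1], z[-1]
        return num / den[:, None]

rng = np.random.default_rng(0)
Q = rng.normal(size=(12, 8))
K = rng.normal(size=(12, 8))
V = rng.normal(size=(12, 8))

# Whole sequence in one call
full = SegmentRecurrentLinearAttention(8, 8).forward(Q, K, V)

# Same sequence in three segments of length 4, reusing the state
attn = SegmentRecurrentLinearAttention(8, 8)
chunked = np.concatenate(
    [attn.forward(Q[s:s + 4], K[s:s + 4], V[s:s + 4]) for s in range(0, 12, 4)]
)
```

Under these assumptions the segmented outputs match the single-pass outputs up to floating-point error, which is one way to read the question of why extra segment-level recurrence would be needed with a causal linear model.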
I am using the causal-linear model and trying to compute attention where Q has shape (L, D) and K and V both have shape (S, D). Here D is the feature dimension, and L and S are the sequence lengths, which are not equal. When I try this with the causal-linear model, an error is returned; it does not seem to support this computation.
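To make the shapes concrete, here is a small NumPy sketch of causal linear attention with L different from S. It is an illustration only and does not describe how the library handles this case. It assumes S >= L and that the extra S - L keys act as a prefix (for example a memory), so query i may attend to keys 0 through i + (S - L). The `elu(x) + 1` feature map and the function names are likewise assumptions.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a positive feature map (an assumption for this sketch)
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def causal_linear_attention(Q, K, V, eps=1e-6):
    """Q: (L, D), K: (S, D), V: (S, M), where M may equal D.
    ASSUMPTION: S >= L and query i sees keys 0 .. i + (S - L)."""
    L, S = Q.shape[0], K.shape[0]
    if S < L:
        raise ValueError("this sketch assumes S >= L")
    q, k = feature_map(Q), feature_map(K)
    # Prefix sums over key positions
    kv = np.cumsum(k[:, :, None] * V[:, None, :], axis=0)  # (S, D, M)
    z = np.cumsum(k, axis=0)                                # (S, D)
    last = np.arange(L) + (S - L)  # index of the last visible key per query
    num = np.einsum("ld,ldm->lm", q, kv[last])
    den = np.einsum("ld,ld->l", q, z[last]) + eps
    return num / den[:, None]

def masked_reference(Q, K, V, eps=1e-6):
    # Quadratic-cost reference using an explicit (L, S) mask
    L, S = Q.shape[0], K.shape[0]
    scores = feature_map(Q) @ feature_map(K).T
    mask = np.arange(S)[None, :] <= np.arange(L)[:, None] + (S - L)
    scores = scores * mask
    return (scores @ V) / (scores.sum(axis=1, keepdims=True) + eps)

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # L = 4
K = rng.normal(size=(6, 8))   # S = 6
V = rng.normal(size=(6, 8))
out = causal_linear_attention(Q, K, V)
ref = masked_reference(Q, K, V)
```

The output has one row per query, so its shape is (L, M), and under the stated alignment it agrees with the masked reference. Other alignments between queries and keys are possible and would change the `last` index.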
Hi,
Commit f165966 should fix this issue (namely, L and S can now be different). If you have more questions, or if the issue should not be closed, feel free to reopen it or open another issue.
Cheers, Angelos
Hi, thanks for your great work! I have a question: if I want to use segment-level recurrence with state reuse, as in Transformer-XL, in a language model, how should I do this? Should I rewrite the code in causal_product_cuda.cu? Thanks for your help.