Closed by jhl-Det 1 year ago
I think it works much like the hidden state in an RNN: it stores the previous step's k*v and scale, which is why it is defined as a dict. For the initial state I just use an empty dict: incremental_state = {}
Yes, @Anker-ZX-AI is right. It's to store the KV caches.
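For anyone landing here later, here is a rough sketch of the idea described above, not the actual RetNet code. It is a single-head, pure-Python recurrent step that keeps the decayed running k*v and a scale in a dict, starting from an empty dict. The function name `retention_step`, the dict keys `prev_key_value` and `scale`, and the simple divide-by-scale normalization are assumptions made for illustration; the real implementation may use different names and a different normalization.

```python
def retention_step(q, k, v, decay, incremental_state):
    """One recurrent step for a single head (illustrative sketch only).

    q, k: lists of length d_k; v: list of length d_v; decay: float in (0, 1).
    incremental_state: dict that is read and updated in place.
    """
    d_k, d_v = len(k), len(v)
    # Outer product k^T v for the current token.
    kv = [[k[i] * v[j] for j in range(d_v)] for i in range(d_k)]

    if "prev_key_value" in incremental_state:
        # Later steps: decay the cached state and add the new k*v.
        prev_kv = incremental_state["prev_key_value"]
        kv = [[decay * prev_kv[i][j] + kv[i][j] for j in range(d_v)]
              for i in range(d_k)]
        # Running sum of decay weights: 1 + decay + decay^2 + ...
        scale = incremental_state["scale"] * decay + 1.0
    else:
        # First step: the dict is empty, so there is nothing to decay.
        scale = 1.0

    # Cache for the next step (hypothetical key names).
    incremental_state["prev_key_value"] = kv
    incremental_state["scale"] = scale

    # Output is q times the cached state, normalized by the scale (assumed form).
    return [sum(q[i] * kv[i][j] for i in range(d_k)) / scale
            for j in range(d_v)]


# Usage: start from an empty dict and feed one token at a time.
incremental_state = {}
tokens = [([1.0, 0.0], [1.0, 2.0], [3.0, 4.0]),
          ([0.0, 1.0], [1.0, 1.0], [1.0, 1.0])]
outputs = [retention_step(q, k, v, 0.5, incremental_state)
           for q, k, v in tokens]
```

The same dict has to be passed to every step of one sequence, and a fresh `{}` should be used when a new sequence starts, otherwise the cache from the previous sequence leaks into the new one.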
THANK YOU!
Hi there, thanks for your great work on RetNet. I have run into a problem when trying to define "incremental_state". Could you provide a usage example or explain it in more detail? Thanks, Best regards.