microsoft / torchscale

Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License

the meaning of "incremental_state" in RetNet #42

Closed: jhl-Det closed this issue 1 year ago

jhl-Det commented 1 year ago

Hi there, thanks for your great work on RetNet. I ran into a problem when trying to define "incremental_state". Could you show how to use it or explain it in more detail? Thanks, and best regards.

Anker-ZX-AI commented 1 year ago

I think it's kind of like the hidden state in an RNN: it stores the previous step's k*v and scale. It is defined as a dict, and for the initial state I just use an empty dict: incremental_state = {}

shumingma commented 1 year ago

Yes, @Anker-ZX-AI is right. Its purpose is to store the KV caches.
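
To make the idea in this thread concrete, here is a minimal single-head sketch of a dict-based recurrent state in plain Python. It is an illustration under stated assumptions, not torchscale's actual code: the function name `retention_step`, the key name `prev_kv`, and the exact normalization by `scale` are all hypothetical choices made for this example. It only shows the pattern described above: start from an empty dict, and let each decoding step read and overwrite the stored k*v and scale.

```python
from typing import Dict, List

Vector = List[float]


def retention_step(
    q: Vector, k: Vector, v: Vector, decay: float, incremental_state: Dict
) -> Vector:
    """One recurrent decoding step for a single head (toy version).

    The names and the normalization used here are assumptions for
    illustration, not the library's API.
    """
    # Outer product k^T v: the contribution of the current token.
    kv = [[ki * vj for vj in v] for ki in k]

    if "prev_kv" in incremental_state:
        prev_kv = incremental_state["prev_kv"]
        # Running normalizer: 1 + decay + decay^2 + ...
        scale = incremental_state["scale"] * decay + 1.0
        # Decay-weighted running average of every k^T v seen so far.
        kv = [
            [p * (1.0 - 1.0 / scale) + c / scale for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(prev_kv, kv)
        ]
    else:
        # First step: the dict is empty, so there is nothing to decay.
        scale = 1.0

    # Update the dict in place so the caller can pass it to the next step.
    incremental_state["prev_kv"] = kv
    incremental_state["scale"] = scale

    # Output is the query multiplied by the state matrix.
    return [sum(qi * kv[i][j] for i, qi in enumerate(q)) for j in range(len(v))]


# Usage: begin with an empty dict and reuse it across decoding steps.
incremental_state = {}
steps = [
    ([1.0, 0.0], [1.0, 0.0], [2.0, 4.0]),  # (q, k, v) for token 1
    ([1.0, 0.0], [1.0, 0.0], [4.0, 8.0]),  # (q, k, v) for token 2
]
for q, k, v in steps:
    out = retention_step(q, k, v, decay=0.5, incremental_state=incremental_state)
```

The state has a fixed size no matter how many tokens have been processed, which is why it behaves like an RNN hidden state rather than a cache that grows with sequence length.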

jhl-Det commented 1 year ago

THANK YOU!