Question about the recurrent forward of MultiScaleRetention

microsoft / torchscale

Foundation Architecture for (M)LLMs

https://aka.ms/GeneralAI

MIT License


Closed by LEECHOONGHO 1 year ago

LEECHOONGHO commented 1 year ago

In MultiScaleRetention's recurrent forward, it looks like the incremental state is not being updated (it is not returned) [1].

shumingma commented 1 year ago

The incremental state is a Python dictionary, so it's updated in place.
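A minimal sketch of that behavior, assuming a simplified scalar state. The names `recurrent_step`, `rebind_only`, `decay`, and the `"prev"` key are hypothetical and only illustrate the point; this is not the torchscale implementation.

```python
def recurrent_step(value, decay, incremental_state):
    """Hypothetical stand-in for a recurrent forward (not the torchscale code)."""
    if "prev" in incremental_state:
        # Fold the previous state into the new one.
        state = incremental_state["prev"] * decay + value
    else:
        # First step: nothing cached yet.
        state = value
    # Item assignment mutates the caller's dict; the dict is never returned.
    incremental_state["prev"] = state
    return state


def rebind_only(incremental_state):
    """Rebinding the local name does NOT affect the caller's dict."""
    incremental_state = {"prev": 0.0}


cache = {}
out1 = recurrent_step(1.0, 0.5, cache)  # first step, cache was empty
out2 = recurrent_step(1.0, 0.5, cache)  # second step sees the cached state
rebind_only(cache)                      # leaves cache untouched

print(out1, out2, cache)
```

The function only returns the output, yet `cache` carries the state across calls, because the caller and the function hold references to the same dictionary object. Only rebinding the parameter name itself, as in `rebind_only`, would be invisible to the caller.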

LEECHOONGHO commented 1 year ago

Oh, I see. Sorry for my mistake.