microsoft / torchscale

Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License

Question about the recurrent forward of MultiScaleRetention #62

Closed LEECHOONGHO closed 1 year ago

LEECHOONGHO commented 1 year ago

In MultiScaleRetention's recurrent forward, it looks like the incremental state is not being updated (it is not returned) [1].

[1] https://github.com/microsoft/torchscale/blob/258eda33083f6361e7305f2a5afd241e381826e1/torchscale/component/multiscale_retention.py#L117C18-L117C18

shumingma commented 1 year ago

The incremental state is a Python dictionary, so it's updated in place.
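
A minimal sketch of that in-place behavior, in plain Python rather than torchscale's actual code. The function names `recurrent_step` and `rebinding_step`, the key `"prev_state"`, and the toy update rule are all hypothetical and chosen only for illustration. The point is that item assignment on a dict mutates the object the caller passed in, so the state persists across calls even though only the output is returned.

```python
def recurrent_step(x, decay, incremental_state):
    """Toy recurrent update (hypothetical): state = decay * previous state + x."""
    prev = incremental_state.get("prev_state", 0.0)
    state = decay * prev + x
    # Item assignment mutates the dict object the caller passed in,
    # so the caller sees the new state without it being returned.
    incremental_state["prev_state"] = state
    return state  # only the output is returned, not the dict


def rebinding_step(x, incremental_state):
    """Counter-example: rebinding the local name does NOT reach the caller."""
    # This creates a new dict and binds it to the local name only.
    incremental_state = {"prev_state": x}
    return x


cache = {}
outputs = [recurrent_step(x, 0.5, cache) for x in (1.0, 2.0, 3.0)]

untouched = {}
rebinding_step(1.0, untouched)

print(outputs)    # outputs of the three steps
print(cache)      # state carried across calls
print(untouched)  # still empty
```

The contrast with `rebinding_step` is the case to watch for: if a function assigned a fresh dictionary to its parameter instead of writing into the existing one, the caller's state would be lost and the dictionary would have to be returned.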

LEECHOONGHO commented 1 year ago

Oh, I see. Sorry for my mistake.