support lm prefix computation in one go

microsoft / torchscale

Foundation Architecture for (M)LLMs

https://aka.ms/GeneralAI

MIT License


support lm prefix computation in one go #33

Closed. XingxingZhang closed this 1 year ago

XingxingZhang commented 1 year ago

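In LM decoding with a prefix (e.g., a prompt), we can compute the hidden states of all prefix tokens together in the first step by setting incremental_state["is_first_step"] = True. Without this flag, incremental decoding processes one token per call, so the prefix has to be fed token by token.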
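To illustrate the idea, here is a minimal, self-contained sketch. It does not use torchscale: `ToyDecoder`, `decode_one_go`, and `decode_token_by_token` are hypothetical names, the "hidden state" is just a running sum standing in for causal attention over cached tokens, and the `"cache"` key is an assumption made for this toy. Only the `incremental_state["is_first_step"]` flag comes from this PR.

```python
class ToyDecoder:
    """Toy stand-in for an incremental decoder (NOT torchscale's Decoder)."""

    def __init__(self):
        self.calls = 0  # number of forward passes, to compare both strategies

    def forward(self, tokens, incremental_state):
        self.calls += 1
        # On the first step, process every prefix token in this single call.
        # On later steps, only the newest token is processed.
        first = incremental_state.get("is_first_step", False)
        new_tokens = tokens if first else tokens[-1:]
        cache = incremental_state.setdefault("cache", [])  # hypothetical key
        hidden = []
        for tok in new_tokens:
            cache.append(tok)
            hidden.append(sum(cache))  # depends on all tokens seen so far
        return hidden


def decode_one_go(decoder, prefix, steps):
    """Compute all prefix states in one call, then decode step by step."""
    state = {"is_first_step": True}
    tokens = list(prefix)
    hidden = decoder.forward(tokens, state)
    state["is_first_step"] = False  # switch to one-token-per-call decoding
    for _ in range(steps):
        tokens.append(hidden[-1] % 7)  # toy "next token" rule
        hidden = decoder.forward(tokens, state)
    return tokens


def decode_token_by_token(decoder, prefix, steps):
    """Baseline: feed the prefix one token per call."""
    state = {}
    tokens, hidden = [], []
    for tok in prefix:
        tokens.append(tok)
        hidden = decoder.forward(tokens, state)
    for _ in range(steps):
        tokens.append(hidden[-1] % 7)
        hidden = decoder.forward(tokens, state)
    return tokens


fast, slow = ToyDecoder(), ToyDecoder()
out_fast = decode_one_go(fast, [3, 1, 4], steps=2)
out_slow = decode_token_by_token(slow, [3, 1, 4], steps=2)
```

Both strategies produce the same sequence, but the one-go variant spends a single forward call on the prefix instead of one call per prefix token. Note that the flag has to be reset to False after the first step, otherwise later calls would reprocess the whole sequence.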