intel / xFasterTransformer

Apache License 2.0
344 stars 60 forks

[Common] Modify resize() in DecoderContext to support #367

Closed pujiang2018 closed 4 months ago

pujiang2018 commented 4 months ago

Refactored the resize function in DecoderContext to support Continuous Batching, and removed the qkScores member: it is rarely used, and the attention implementation will most likely want to manage that buffer itself.
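To illustrate the idea, here is a minimal sketch, not the code from this PR. It assumes that supporting Continuous Batching means sizing buffers by the total token count across sequences of different lengths, rather than by batch size times a single sequence length. `SketchContext`, `SketchAttention`, `workspace`, and `getScores` are hypothetical names invented for this example and do not reflect the actual DecoderContext members or signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a decoder context; NOT the real DecoderContext.
struct SketchContext {
    int hiddenSize;               // per-token hidden dimension
    int batchSize = 0;            // number of sequences in the current step
    int totalTokens = 0;          // sum of input lengths across the batch
    std::vector<float> workspace; // shared activation buffer, grown on demand

    explicit SketchContext(int hidden) : hiddenSize(hidden) { assert(hidden > 0); }

    // Resize for a batch whose sequences may each have a different length
    // (assumed here to be the continuous-batching case). The buffer only
    // grows; it is never shrunk, so later smaller batches reuse the memory.
    void resize(const std::vector<int> &inputLens) {
        batchSize = static_cast<int>(inputLens.size());
        totalTokens = std::accumulate(inputLens.begin(), inputLens.end(), 0);
        std::size_t needed = static_cast<std::size_t>(totalTokens) * hiddenSize;
        if (needed > workspace.size()) workspace.resize(needed);
    }
};

// With no score buffer in the context, a (hypothetical) attention
// implementation owns and sizes its own scratch space.
struct SketchAttention {
    std::vector<float> scores;

    float *getScores(int queryLen, int keyLen) {
        std::size_t needed = static_cast<std::size_t>(queryLen) * keyLen;
        if (needed > scores.size()) scores.resize(needed);
        return scores.data();
    }
};
```

In this sketch, calling `resize({3, 1, 1})` on a context with `hiddenSize` 4 sets `totalTokens` to 5 and grows `workspace` to 20 floats; a later `resize({1})` keeps the 20-float buffer. Keeping the score buffer inside the attention object means the context no longer has to guess which shape a given attention kernel needs.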