Closed erfanium closed 6 months ago
From the FIM paper (https://arxiv.org/pdf/2207.14255.pdf), section 3.1: SPM mode can be used to reuse the KV cache across completion requests.
SPM mode can enable further latency optimization (which is very important for code completion tools). Is there any reason that StarCoder models are using the normal PSM mode?
We train with both modes (50% PSM and 50% SPM), similarly to StarCoder (cf. the paper). So you can also try SPM mode for inference.
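
For anyone who wants to try this, here is a rough sketch of how the two prompt layouts differ and why SPM helps with KV cache reuse. It assumes StarCoder-style sentinel strings (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) and the SPM layout where the suffix comes first and the prefix comes last; both are assumptions, so check your model's tokenizer and training code. The helper names are made up for illustration, and the comparison is done on characters as a stand-in for tokens.

```python
# Assumed StarCoder-style sentinels; verify against your tokenizer's special tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def psm_prompt(prefix: str, suffix: str) -> str:
    """PSM layout: prefix, then suffix, then the model generates the middle."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def spm_prompt(prefix: str, suffix: str) -> str:
    """SPM layout (assumed variant): suffix first, prefix last."""
    return f"{FIM_PREFIX}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{prefix}"


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical editor state: the user keeps typing, so only the prefix grows.
suffix = "\n    return result\n"
prefix_before = "def add(a, b):\n    result = "
prefix_after = prefix_before + "a + b"

spm_old, spm_new = spm_prompt(prefix_before, suffix), spm_prompt(prefix_after, suffix)
psm_old, psm_new = psm_prompt(prefix_before, suffix), psm_prompt(prefix_after, suffix)

# SPM: the old prompt is a leading substring of the new one, so its cache can be reused.
spm_reusable = shared_prefix_len(spm_old, spm_new)
# PSM: the prompts diverge where the new text was typed, before the suffix.
psm_reusable = shared_prefix_len(psm_old, psm_new)

print(f"SPM reusable: {spm_reusable}/{len(spm_old)} chars of the old prompt")
print(f"PSM reusable: {psm_reusable}/{len(psm_old)} chars of the old prompt")
```

With SPM, typing only appends to the end of the prompt, so the whole previous request can be served from cache. With PSM, the suffix has to be recomputed on every keystroke. In a real setup the comparison happens on token IDs, and tokenization near the edit point may shift the boundary slightly.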
Got it. thanks!