FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model
https://funaudiollm.github.io/

No code-path supporting cache-aware streaming for SenseVoiceSmall Model? #143

Open Jacob-Bishop opened 1 month ago

Jacob-Bishop commented 1 month ago

Notice: In order to resolve issues more efficiently, please raise issues following the template and include details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

Does this model support cache-aware streaming?

The paper reference in SenseVoiceEncoderSmall, as well as the forward_chunk methods in SinusoidalPositionEncoder and MultiHeadedAttentionSANM, suggests to me that cache-aware streaming is intended to be supported. However, I don't see a code path in the SenseVoiceSmall model that actually exercises those methods.

Am I missing something? If this is not supported yet, is it planned for the future?
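For reference, the only inference path I can find today is the offline one. Here is a minimal sketch of it, following the AutoModel example in this repo's README (the input path and options below are illustrative, and may differ in your setup):

from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

model = AutoModel(
    model="iic/SenseVoiceSmall",
    trust_remote_code=True,
    vad_model="fsmn-vad",  # VAD segments long audio before offline decoding
    device="cuda:0",
)
res = model.generate(
    input=f"{model.model_path}/example/en.mp3",
    cache={},          # accepted, but the whole utterance is decoded at once
    language="auto",
    use_itn=True,
)
print(rich_transcription_postprocess(res[0]["text"]))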

Looking at the FunASR example for streaming models in the README, I would expect this to support a streaming API like:

cache = {}
total_chunk_num = int((len(speech) - 1) / chunk_stride + 1)  # ceil(len(speech) / chunk_stride); the README's len((speech)-1) is a typo
for i in range(total_chunk_num):
    speech_chunk = speech[i * chunk_stride:(i + 1) * chunk_stride]
    is_final = i == total_chunk_num - 1
    res = model.generate(input=speech_chunk, cache=cache, is_final=is_final, chunk_size=chunk_size, encoder_chunk_look_back=encoder_chunk_look_back, decoder_chunk_look_back=decoder_chunk_look_back)
    print(res)
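For completeness, the parameters above would come from the same FunASR streaming example; these are the values its README uses for the 600 ms configuration (assumed here, since nothing SenseVoice-specific exists yet):

chunk_size = [0, 10, 5]             # [0, 10, 5] -> 600 ms; [0, 8, 4] -> 480 ms
encoder_chunk_look_back = 4         # chunks the encoder self-attention looks back
decoder_chunk_look_back = 1         # encoder chunks the decoder cross-attention looks back
chunk_stride = chunk_size[1] * 960  # samples per chunk at 16 kHz (600 ms)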

Thanks!

Code

What have you tried?

What's your environment?

pengzhendong commented 1 month ago

You may refer to my repository: https://github.com/pengzhendong/streaming-sensevoice/blob/master/realtime.py
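The chunk-driven loop that repository implements looks roughly like the sketch below. The class name, import, and method signature here are placeholders for illustration only, not the repo's exact API; see realtime.py in the link above for the real interface.

import numpy as np

# Placeholder names for illustration; realtime.py defines the actual
# streaming wrapper and its interface.
from streaming_sensevoice import StreamingSenseVoice  # assumed import

model = StreamingSenseVoice()
chunk_samples = 1600  # 100 ms at 16 kHz (assumed chunk size)
audio = np.zeros(16000 * 5, dtype=np.float32)  # stand-in for mic/file samples
for start in range(0, len(audio), chunk_samples):
    chunk = audio[start:start + chunk_samples]
    is_last = start + chunk_samples >= len(audio)
    for res in model.streaming_inference(chunk, is_last):  # assumed signature
        print(res)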