Unable to reproduce the cache-aware streaming results of FastConformer with Conformer.

I trained a Conformer-CTC model using the cache-aware method, but I am unable to reproduce the results seen with "FastConformer cache-aware streaming". With the cache-aware Conformer-CTC model, the streaming output contains partial sub-words and words merged together (e.g. "hello world" ==> "herld").

The same happened when I deployed the model in a RIVA pipeline.

However, when I use the "transcribe" function, I get a proper transcription for the same audio file. @titu1994 @VahidooX could you please shed some light on this? When and how should streaming models be used in .nemo format and in a RIVA pipeline? My question is related to these discussions: [https://github.com/NVIDIA/NeMo/discussions/7010] [https://github.com/NVIDIA/NeMo/discussions/5284]
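
To put a number on the gap between the offline and streaming outputs, one option is a word error rate check. The following is a minimal sketch in plain Python; the function name `word_error_rate` and the two example strings are illustrative assumptions and not part of NeMo or RIVA. In practice the reference would be the "transcribe" output and the hypothesis would be the streaming output for the same audio.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example strings taken from the symptom described above.
offline_text = "hello world"   # stands in for the "transcribe" output
streaming_text = "herld"       # stands in for the streaming output
print(word_error_rate(offline_text, streaming_text))  # 1.0
```

A value close to 0 would mean the streaming output matches the offline one; the merged-word example above scores 1.0 because neither reference word survives intact.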

NVIDIA / NeMo #9495