Open vongrad opened 1 month ago
Yes please. A PR would be welcome. Thank you for diving into that problem. I won’t be able to review properly for a couple of weeks but I do appreciate the PR.
I have made the promised PR: https://github.com/csdcorp/speech_to_text/pull/513
I have been struggling with an issue where iOS does not recognize the first word after calling `.listen()` for the second time and onward, i.e.
listen (all good here)
-> stop
-> listen (missed first word)
-> stop
-> etc...

I have set up a native iOS app using your `SwiftSpeechToTextPlugin.swift` and tried to debug what is causing it. What I have noticed is that if I benchmark the time from `try self.audioEngine.start()` to the first buffer received in the callback of `inputNode?.installTap`, there is approx. `175ms`, which is sufficient to catch the first word. However, on the second and later calls to `listenForSpeech`, the same benchmark results in approx. `850ms`, which is more than enough to miss the first word.

After experimenting a bit, I noticed that instantiating a new `audioEngine` and of course `inputNode` fixes this issue, and we are back at circa `175ms` before receiving the first buffer on the second and later calls. I did not try to dig deeper into why reusing `audioEngine` produces such a delay, even though all related resources seem to be deallocated properly, looking at your code.

If you want, I can make a PR that implements the suggested fix - let me know if I should go for it.
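
For anyone who wants to reproduce the measurement, here is a minimal sketch of the idea described above: create a fresh `AVAudioEngine` for every listen session and time the gap between `start()` and the first tap buffer. This is not the code from the plugin or from the PR. The class `FreshEngineListener`, its methods, and the `onFirstBuffer` callback are hypothetical names used only for illustration, and the sketch assumes microphone permission is granted and the `AVAudioSession` is already configured for recording.

```swift
import AVFoundation

/// Hypothetical helper (not part of the plugin): builds a new AVAudioEngine
/// for each listen session and reports the time until the first tap buffer.
final class FreshEngineListener {
    private var audioEngine: AVAudioEngine?
    private var startedAt: CFAbsoluteTime = 0
    // Touched from the audio tap thread as well; good enough for a benchmark,
    // but production code should synchronize access.
    private var sawFirstBuffer = false

    /// Starts a session on a brand-new engine. `onFirstBuffer` receives the
    /// elapsed seconds between engine start and the first buffer.
    func listen(onFirstBuffer: @escaping (TimeInterval) -> Void) throws {
        stop() // tear down any engine left over from a previous session

        let engine = AVAudioEngine()      // new engine instead of a reused one
        let inputNode = engine.inputNode  // the new engine's own input node
        let format = inputNode.outputFormat(forBus: 0)
        sawFirstBuffer = false

        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] _, _ in
            guard let self = self, !self.sawFirstBuffer else { return }
            self.sawFirstBuffer = true
            onFirstBuffer(CFAbsoluteTimeGetCurrent() - self.startedAt)
        }

        engine.prepare()
        startedAt = CFAbsoluteTimeGetCurrent()
        try engine.start()
        audioEngine = engine
    }

    /// Removes the tap, stops the engine, and drops the reference so the
    /// next call to `listen` starts from a clean instance.
    func stop() {
        guard let engine = audioEngine else { return }
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        audioEngine = nil
    }
}
```

Calling `listen`, `stop`, `listen` in sequence and logging the value passed to `onFirstBuffer` each time should show whether the second session still pays the longer startup delay on a given device.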