Question: configure voice detection delay/debounce for `transcribe-stream`?

Hi, I'm using `voice2json transcribe-stream` with short one-word commands to control a multirotor drone (e.g. "left", "right", "up"). Ideally I'd like the detector to respond as soon as possible after a word is spoken, but currently voice2json seems to wait at least 2 seconds after it detects a voice before passing the audio to the transcriber, judging by the 'end time' of the tokens object. Furthermore, if there's significant background noise (say, a buzzing quadcopter), voice2json keeps recording for up to 15 seconds before handing the audio over for transcription and emitting the JSON line.

Is there any way to configure the minimum and maximum delay for commands? I tried the `--timeout` option, but even with `--timeout 0` the latency from utterance to JSON line seems to be the same.
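
For reference, here is a minimal sketch of how the arrival time of each JSON line could be logged to put numbers on the delay. It only assumes that `transcribe-stream` prints one JSON object per line on stdout; the script name `measure_latency.py` and the added fields `arrival_time` and `seconds_since_previous` are made up for this example and are not part of voice2json.

```python
import json
import sys
import time


def annotate(line, arrival, previous_arrival=None):
    """Parse one JSON line and attach timing fields (names invented for this sketch)."""
    event = json.loads(line)
    event["arrival_time"] = arrival
    # Gap since the previous JSON line, or None for the first line seen.
    event["seconds_since_previous"] = (
        None if previous_arrival is None else arrival - previous_arrival
    )
    return event


def main():
    previous = None
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        # Monotonic clock: only differences between readings are meaningful.
        now = time.monotonic()
        print(json.dumps(annotate(line, now, previous)), flush=True)
        previous = now


if __name__ == "__main__":
    main()
```

It would be run as `voice2json transcribe-stream | python3 measure_latency.py`, saying a command at a known moment and comparing that against when the annotated line appears.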

