Currently we only accept a base64-encoded audio file, which we then convert to text with deepspeech. For very short commands this works, but it adds unnecessary delay as processing can't start until the complete command has arrived.
Instead, we need to accept an audio stream, allowing deepspeech to convert the audio to text in near-real-time while the command is still arriving.
Look into whether to use raw sockets, WebSockets, or something else to accomplish this.
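Whichever transport is chosen, the server-side loop would probably look much the same, so here is a rough, transport-agnostic sketch of it. It assumes the audio arrives as raw 16-bit PCM byte chunks. The `Recognizer` interface and `transcribe_stream` are hypothetical names made up for illustration, not part of any library. The expectation (to be verified) is that deepspeech's streaming API (`createStream`, `feedAudioContent`, `intermediateDecode`, `finishStream`) could sit behind such an interface.

```python
from typing import Iterable, Iterator, Protocol


class Recognizer(Protocol):
    """Hypothetical interface for a streaming speech-to-text backend."""

    def feed(self, pcm: bytes) -> None: ...   # push more audio
    def partial(self) -> str: ...             # best transcript so far
    def finish(self) -> str: ...              # close the stream, final transcript


SAMPLE_WIDTH = 2  # bytes per sample; assumes 16-bit PCM


def transcribe_stream(chunks: Iterable[bytes], recognizer: Recognizer) -> Iterator[str]:
    """Feed audio chunks as they arrive and yield transcripts along the way.

    `chunks` can be anything that yields bytes: a socket read loop, a
    WebSocket message iterator, or a file read in small pieces.
    """
    leftover = b""
    for chunk in chunks:
        data = leftover + chunk
        # Network chunks may split a sample in half; only feed whole samples
        # and carry the odd trailing byte over to the next chunk.
        usable = len(data) - len(data) % SAMPLE_WIDTH
        leftover = data[usable:]
        if usable:
            recognizer.feed(data[:usable])
            yield recognizer.partial()
    # The sender closed the stream: ask the backend for the final result.
    yield recognizer.finish()
```

The point of the generator shape is that the caller can act on partial transcripts before the command has fully arrived, which is exactly the delay the base64 upload cannot avoid. The choice between raw sockets and WebSockets then only affects how `chunks` is produced.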