I would like to chain additional processing steps after recognition has completed. This would allow other cool things to run on top of the speech output alone: sentiment analysis, topic understanding, speaker detection, etc.
Here's a rough sketch of the concept...
Each may follow its own "server" file that launches a new service, so that they don't complicate the existing single-server architecture
Each would communicate over web calls (REST) to keep the processes cleanly decoupled; in the future, we could move to something more rigorous like a message queue.
Each can exchange data via stored JSON/metadata or audio files written to disk
Each can "register" itself with the main speech server as a secondary process on start-up. For example, the "speaker detection" module will (a) launch it's own service, (b) register with primary server, (c) accept REST calls and reply with JSON / text as required
Seeking opinions at this point, with more details to be fleshed out later. Of course, eventually we may convert this suite into a package (e.g. satisfying #2), but that's not paramount right now.