CognitiveBuild / Thoth

a Conversational Agent, the Robot Brain of GBS Innovation Center
https://github.com/CognitiveBuild
Apache License 2.0

Create Ogg/Opus codec wrapper for Speech-to-Text end-to-end streaming #5

Closed: mihui closed this issue 7 years ago

mihui commented 8 years ago

Create an Ogg/Opus codec wrapper (DLL/subproject) for end-to-end Speech-to-Text streaming, to reduce latency.

References: http://www.opus-codec.org/ https://wiki.xiph.org/OggOpus http://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.shtml

mihui commented 8 years ago

The audio format needs to be audio/ogg;codecs=opus for the Watson Speech to Text API to consume it.
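
A minimal sketch of how that content type would be declared on the Watson Speech to Text WebSocket interface: the client sends a JSON text message with `"action": "start"` before the binary audio, and `"action": "stop"` after it. The helper names `build_start_message` and `build_stop_message` are hypothetical, and Python is used only for illustration (the actual project targets .NET).

```python
import json

# Content type the Watson Speech to Text service expects for Ogg/Opus audio.
OGG_OPUS_CONTENT_TYPE = "audio/ogg;codecs=opus"


def build_start_message(content_type=OGG_OPUS_CONTENT_TYPE, interim_results=True):
    """Return the JSON text message that opens a recognition request.

    Hypothetical helper: it only builds the payload; sending it over an
    already-open WebSocket is left to whichever library the project picks.
    """
    return json.dumps({
        "action": "start",
        "content-type": content_type,
        "interim_results": interim_results,
    })


def build_stop_message():
    """Return the JSON text message that signals the end of the audio."""
    return json.dumps({"action": "stop"})
```

The binary Ogg/Opus data is sent as binary WebSocket frames between these two text messages.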

mihui commented 7 years ago

@Hamasn, please investigate this and port the C code as a wrapper for .NET projects. You may use a third-party WebSocket library for testing; please consult with @yidlhu about which WebSocket library to use. Then integrate it with the microphone callback method to send out the encoded data stream.
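
A rough sketch of the integration described here, assuming 16 kHz, 16-bit mono PCM from the microphone and 20 ms frames (640 bytes each). `FrameStreamer`, `encode`, and `send` are hypothetical names: `encode` stands in for the Ogg/Opus wrapper and `send` for the WebSocket library's binary send call, neither of which has been chosen yet. Python is used only for illustration.

```python
class FrameStreamer:
    """Buffers raw PCM from a microphone callback into fixed-size frames,
    then passes each frame through an encoder and on to a sender.

    `encode` and `send` are injected callables (assumptions, not real APIs):
    `encode` must return Ogg-encapsulated Opus bytes, `send` must transmit
    them as a binary WebSocket frame.
    """

    def __init__(self, encode, send, sample_rate=16000, frame_ms=20,
                 channels=1, sample_width=2):
        # Bytes per frame: samples per frame * channels * bytes per sample.
        self.frame_bytes = (sample_rate * frame_ms // 1000) * channels * sample_width
        self._encode = encode
        self._send = send
        self._buffer = bytearray()

    def on_audio(self, pcm):
        """Microphone callback: accepts PCM chunks of any size."""
        self._buffer.extend(pcm)
        # Emit every complete frame; keep the remainder for the next call.
        while len(self._buffer) >= self.frame_bytes:
            frame = bytes(self._buffer[:self.frame_bytes])
            del self._buffer[:self.frame_bytes]
            self._send(self._encode(frame))
```

Sending each frame as soon as it is complete, rather than waiting for the whole recording, is what keeps the end-to-end latency low.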

mihui commented 7 years ago

Any updates, guys?

yidlhu commented 7 years ago

Can we move this task to the next sprint, after we have all the functions? Currently working on adding the text-to-speech function and the show images/video function.

yidlhu commented 7 years ago

I found an Opus/Ogg solution in thothWPF; it works well.