Closed walchko closed 8 years ago
Speech API would be nice! Any update on this?
I know I can hack it by submitting a speech request to your (or any other) speech API and then submitting the 1-best hypothesis to your converse API. However, as your speech API is quite slow, the latency is not trivial, and the user experience is horrible just because I need to submit two requests instead of one. If you provided a converse API that accepts speech directly, it would speed things up considerably.
+1
I also think that this would be great; after all, there's not much of a point to natural speech if you can't actually speak
+1
+1 converse API through speech directly is a great feature to have
+1
This feature would make the Pywit library perfect! still waiting.... :/
Hi everybody, apologies for the lack of responsiveness here and thanks for keeping this issue alive. We used to have audio recording + streaming in the first versions of the library, but it was a constant source of pain, as it involved a lot of platform specific code.
Regarding audio recording (from a microphone device), I don't think it makes sense to add that to pywit, as it's highly platform specific and does not make sense for server-side use cases.
Regarding the network streaming part, we'd be open to adding back a method .speech() to the client that takes a "stream of bytes" (what's the idiomatic way to represent that?), uploads it to Wit and returns the response object. We'd need to come up with a solution that works on both Python 2 and 3. We may come around to doing that, but we're working on some other awesome things at the moment. Contributions welcome!
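For illustration, here is a minimal sketch of what "takes a stream of bytes and uploads it" could look like, using only the Python 3 standard library. It is not pywit's actual implementation: the function name `build_speech_request` and its parameters are hypothetical, and the endpoint, headers and version string are copied from the curl example in this thread. The function only builds the request, so it can be inspected without touching the network.

```python
import urllib.request

# Endpoint taken from the curl example in this thread.
WIT_SPEECH_URL = "https://api.wit.ai/speech"


def build_speech_request(audio, token, content_type="audio/wav", version="20141022"):
    """Build (but do not send) a POST request carrying audio bytes.

    Hypothetical helper, not part of pywit. `audio` may be a bytes object
    or any binary file-like object that has a .read() method.
    """
    if isinstance(audio, (bytes, bytearray)):
        body = bytes(audio)
    else:
        # File-like object: read everything into memory (no chunked streaming).
        body = audio.read()
    url = "{}?v={}".format(WIT_SPEECH_URL, version)
    headers = {
        "Authorization": "Bearer " + token,
        "Content-Type": content_type,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To actually send it, one would open the file in binary mode and pass the resulting request to `urllib.request.urlopen`. Accepting either bytes or a file-like object is one possible answer to the "idiomatic way" question; a Python 2 compatible version would need a different import.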
You might want to actually read what I was asking for ... I never asked you to capture audio. Just make python as complete as your http api so I can send an audio file for you to interpret ... it is simple!
You also might want to check your pull requests ... "Method added to upload voice commands" #67 already does this. I independently implemented a very similar solution long ago, but was far too lazy to submit a pull request. @willywongi however did, so please take a look at his work and consider committing it.
I commented on the PR, hopefully @willywongi can get around to implementing the last bit soon. We'll merge then.
"Good news everyone!" I pushed the correction @blandinw was asking for - I had forgotten to allow users to set the correct content-type header.
Thank you @willywongi! I merged your PR + bumped Wit to 4.2.0 on PyPI.
Can you please write a complete library? Please include a function for speech (link to your API) passed as an audio file. Basically it does this (per your docs):
$ curl -XPOST 'https://api.wit.ai/speech?v=20141022' \
    -i -L \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: audio/wav" \
    --data-binary "@sample.wav"
curl -XPOST "https://api.wit.ai/speech?v=20211113" \
    -i -L \
    -H "Authorization: Bearer [YOUR_TOKEN]" \
    -H "Content-Type: audio/raw;encoding=signed-integer;bits=16;rate=44100;endian=little" \
    --data-binary "@[YOUR_AUDIO].wav"
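The audio/raw header in the command above has to match the audio actually sent. As a rough sketch, the header string could be derived from a WAV file with Python's standard `wave` module. Both function names are hypothetical, the parameter names are simply copied from the header in the curl command, and the helper assumes 16-bit PCM WAV input (which is signed and little-endian); other sample widths are rejected rather than guessed at.

```python
import wave


def raw_audio_content_type(rate, bits=16, encoding="signed-integer", endian="little"):
    """Format a Content-Type value in the shape used by the curl example.

    Hypothetical helper; which values the API accepts is not stated in this thread.
    """
    return "audio/raw;encoding={};bits={};rate={};endian={}".format(
        encoding, bits, rate, endian
    )


def wav_to_raw(path_or_file):
    """Return (raw PCM frames, matching Content-Type) for a 16-bit WAV file.

    Accepts a filename or a binary file-like object.
    """
    with wave.open(path_or_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            # Assumption of this sketch: only 16-bit samples are handled.
            raise ValueError("expected 16-bit samples")
        frames = wav.readframes(wav.getnframes())
        return frames, raw_audio_content_type(rate=wav.getframerate())
```

The returned frames exclude the WAV header, so they are the bytes one would send as the request body alongside the returned Content-Type value.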