wit-ai / pywit

Python library for Wit.ai

Get text from audio #38

Closed walchko closed 8 years ago

walchko commented 8 years ago

Can you please write a complete library? Please include a function for speech (link to your API) passed as an audio file. Basically it does this (per your docs):

  $ curl -XPOST 'https://api.wit.ai/speech?v=20141022' \
   -i -L \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: audio/wav" \
   --data-binary "@sample.wav"

oplatek commented 8 years ago

Speech API would be nice! Any update on this?

I know I can hack it by submitting a speech request to your speech API (or any other) and then submitting the 1-best hypothesis to your converse API. However, your speech API is quite slow, so the latency is not trivial, and the user experience is horrible simply because I need to submit two requests instead of one. If you provided a converse API that accepts speech directly, it would speed things up considerably.

jhoelzl commented 8 years ago

+1

goose121 commented 8 years ago

I also think that this would be great; after all, there's not much of a point to natural speech if you can't actually speak.

lowdev commented 8 years ago

+1

milindaj commented 8 years ago

+1 converse API through speech directly is a great feature to have

andehr commented 8 years ago

+1

Accentrix commented 8 years ago

This feature would make the pywit library perfect! Still waiting... :/

blandinw commented 8 years ago

Hi everybody, apologies for the lack of responsiveness here and thanks for keeping this issue alive. We used to have audio recording + streaming in the first versions of the library, but it was a constant source of pain, as it involved a lot of platform specific code.

Regarding audio recording (from a microphone device), I don't think it makes sense to add that to pywit, as it's highly platform specific and does not make sense for server-side use cases.

Regarding the network streaming part, we'd be open to adding back a .speech() method to the client that takes a "stream of bytes" (what's the idiomatic way to represent that?), uploads it to Wit, and returns the response object. We'd need to come up with a solution that works on both Python 2 and 3. We may come around to doing that, but we're working on some other awesome things at the moment. Contributions welcome!
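For illustration only, here is a minimal sketch of what such a helper could look like. It is not pywit's implementation: the names `build_speech_request` and `speech` are hypothetical, the URL and version string are copied from the curl example at the top of this thread, and it assumes Python 3 only, the standard library's `urllib`, and a response body that is a single JSON document. The "stream of bytes" is represented as either `bytes` or any binary file-like object with a `.read()` method.

```python
import json
import urllib.request

# Assumption: endpoint and version string copied from the curl example in this thread.
WIT_SPEECH_URL = "https://api.wit.ai/speech?v=20141022"


def build_speech_request(audio, token, content_type="audio/wav"):
    """Build (but do not send) a POST request carrying the audio bytes.

    `audio` may be bytes or any binary file-like object exposing .read().
    This helper is hypothetical and not part of pywit.
    """
    body = audio if isinstance(audio, (bytes, bytearray)) else audio.read()
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=bytes(body),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": content_type,
        },
        method="POST",
    )


def speech(audio, token, content_type="audio/wav"):
    """Upload the audio and return the decoded response.

    Assumes the response body is one JSON document.
    """
    request = build_speech_request(audio, token, content_type)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would open the file in binary mode and pass the handle, for example `speech(open("sample.wav", "rb"), token)`. Splitting request construction from sending keeps the header logic checkable without network access.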

walchko commented 8 years ago

You might want to actually read what I was asking for ... I never asked you to capture audio. Just make the Python library as complete as your HTTP API so I can send an audio file for you to interpret ... it is simple!

You also might want to check your pull requests ... "Method added to upload voice commands" #67 already does this. I independently implemented a very similar solution long ago, but was far too lazy to submit a pull request. @willywongi, however, did, so please take a look at his work and consider committing it.

blandinw commented 8 years ago

I commented on the PR, hopefully @willywongi can get around to implementing the last bit soon. We'll merge then.

willywongi commented 8 years ago

"Good news everyone!" I pushed the correction @blandinw was asking for: I had forgotten to allow users to set the correct Content-Type header.

blandinw commented 8 years ago

Thank you @willywongi! I merged your PR + bumped Wit to 4.2.0 on PyPI.

sergios-ferreira commented 2 years ago

Can you please write a complete library? Please include a function for speech (link to your API) passed as an audio file. Basically it does this (per your docs):

  $ curl -XPOST 'https://api.wit.ai/speech?v=20141022' \
   -i -L \
   -H "Authorization: Bearer $TOKEN" \
   -H "Content-Type: audio/wav" \
   --data-binary "@sample.wav"

  $ curl -XPOST "https://api.wit.ai/speech?v=20211113" \
   -i -L \
   -H "Authorization: Bearer [YOUR_TOKEN]" \
   -H "Content-Type: audio/raw;encoding=signed-integer;bits=16;rate=44100;endian=little" \
   --data-binary "@[YOUR_AUDIO].wav"

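Remember: the @ in front of [YOUR_AUDIO] is important. It tells curl to read the request body from that file instead of sending the literal string.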
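The Content-Type in the command above describes headerless PCM samples (encoding, bit depth, sample rate, byte order), while the uploaded file is a .wav, which carries its own header. As a rough sketch, the standard library's `wave` module can read those parameters from the file and extract the bare samples so the header and body agree. The function name `raw_pcm_from_wav` is hypothetical, the sketch handles only 16-bit PCM WAV files, and whether the server needs the header stripped for `audio/raw` is an assumption not confirmed in this thread.

```python
import wave


def raw_pcm_from_wav(source):
    """Return (content_type, pcm_bytes) for a 16-bit PCM WAV file.

    `source` is a path or a binary file-like object. Hypothetical helper:
    it only mirrors the header format used in the curl command above.
    """
    with wave.open(source, "rb") as wav:
        bits = wav.getsampwidth() * 8
        if bits != 16:
            # 16-bit WAV samples are signed little-endian; other widths
            # would need a different encoding value, so refuse them here.
            raise ValueError("expected 16-bit PCM, got %d-bit" % bits)
        rate = wav.getframerate()
        pcm = wav.readframes(wav.getnframes())
    content_type = (
        "audio/raw;encoding=signed-integer;bits=%d;rate=%d;endian=little"
        % (bits, rate)
    )
    return content_type, pcm
```

The returned pair can be used as the Content-Type header and request body of the POST. Reading the rate from the file avoids hard-coding 44100 for recordings made at another sample rate.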