watson-developer-cloud / python-sdk

:snake: Client library to use the IBM Watson services in Python and available in pip as watson-developer-cloud
https://pypi.org/project/ibm-watson/
Apache License 2.0
1.46k stars 828 forks

Switch Speech to Text to use WebSockets #12

Closed germanattanasio closed 6 years ago

germanattanasio commented 9 years ago

Hi @daniel-bolanos, I was thinking we could provide support for WebSockets in the SDK by adding an existing library and handling authorization through the authorization service. What do you think?

We should:

mdrx-io commented 8 years ago

I second this!

daniel-bolanos commented 8 years ago

Guys, we have this Python sample code that uses WebSockets to talk to the STT service:

https://github.com/watson-developer-cloud/speech-to-text-websockets-python

I wrote it and shared it publicly a long time ago... we can just merge it.

mdrx-io commented 8 years ago

@daniel-bolanos somehow I missed your example. Thanks for sharing it here.

daniel-bolanos commented 8 years ago

no problem!

germanattanasio commented 8 years ago

See https://github.com/HomeHabbit/stt-watson

jsstylos commented 7 years ago

Some more information:

Node SDK implementation: https://github.com/watson-developer-cloud/node-sdk/blob/master/speech-to-text/recognize_stream.js

Documentation: https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.shtml

Potential Python WebSocket libraries:

http://autobahn.ws/python/index.html (used by https://github.com/HomeHabbit/stt-watson and https://github.com/watson-developer-cloud/speech-to-text-websockets-python)

https://pypi.python.org/pypi/websocket-client/ (LGPL license)

https://websockets.readthedocs.io/en/stable/ (only Python 3.3+)

https://ws4py.readthedocs.io (seems like more of a framework that can work with different possible WebSocket libraries)

https://anaconda.org/pypi/websocket-client (LGPL, no async functions?)

It's probably worth looking for more and spending some time investigating each of these, but my initial inclination would probably be the Autobahn library, given that it's been used successfully before for the service.
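
Whichever library is chosen, the message framing it has to carry is the same. Here is a minimal, library-agnostic sketch of that framing as I understand it from the WebSocket documentation linked above: a JSON start message, binary audio frames, then a JSON stop message. The helper names (`build_start_message`, `build_stop_message`, `iter_audio_chunks`, `frames_for`) and the 4096-byte chunk size are made up for illustration. Only the `action`, `content-type`, and `interim_results` fields come from the service docs, and the actual sending would be done by the chosen library.

```python
import io
import json


def build_start_message(content_type="audio/wav", interim_results=True):
    """Return the JSON text frame that opens a recognition request."""
    return json.dumps({
        "action": "start",
        "content-type": content_type,
        "interim_results": interim_results,
    })


def build_stop_message():
    """Return the JSON text frame that signals the end of the audio."""
    return json.dumps({"action": "stop"})


def iter_audio_chunks(stream, chunk_size=4096):
    """Yield the audio as fixed-size binary chunks (the last may be shorter)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def frames_for(stream, content_type="audio/wav"):
    """Yield (is_binary, payload) pairs in the order they would be sent."""
    yield False, build_start_message(content_type)
    for chunk in iter_audio_chunks(stream):
        yield True, chunk
    yield False, build_stop_message()


if __name__ == "__main__":
    fake_audio = io.BytesIO(b"\x00" * 10000)  # stand-in for a real .wav file
    for is_binary, payload in frames_for(fake_audio):
        print("binary" if is_binary else "text", len(payload))
```

Keeping the framing separate from the transport like this would make it easier to swap Autobahn for another library later.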

kognate commented 7 years ago

I will fix this today.

germanattanasio commented 7 years ago

Josh, I want to know what solution you will propose. If requests provides WebSocket support, that's great. If you need to find a new library, then we want to use one that works in all the different Python versions.

Thanks for doing this 🙏

kognate commented 7 years ago

Jeff already suggested Autobahn, which is a reasonable choice. I was going to use it (should be fine)

daniel-bolanos commented 7 years ago

Hi @kognate if you go with Autobahn then you can get inspiration from this tool I created a while back: https://github.com/watson-developer-cloud/speech-to-text-websockets-python

btw. I have also tried the Tornado websockets library successfully with STT, which may be even simpler than Autobahn

kognate commented 7 years ago

I was basically going to take most of the code from that @daniel-bolanos

I'll check out tornado. Thanks for the tip.

kognate commented 7 years ago

I'm still working on this, but I'll get it done before Monday.

kognate commented 7 years ago

OK, I'm still working on the twisted/tornado version. I've got a version that works for python >=3.5 at https://github.com/kognate/watson_speech_to_text_websockets

While I'm super happy with the async code, I think the version support is going to be an issue. The autobahn lib was giving me a lot of grief integrating it into the existing stt class.

Now that I've pushed the one working version I'm going to try again with autobahn and give up on using a FileSender and try a different strategy.

I did look at tornado, but it seemed to me better to get autobahn going if possible (since it's so widely used).

kognate commented 7 years ago

I just pushed a PR that adds autobahn websockets support. This one needs a lot of review.

daniel-bolanos commented 7 years ago

Hi @kognate @germanattanasio how is this thing going? ready for prime time?

thanks

germanattanasio commented 7 years ago

We want to have a solution that works for different versions of python. This doesn't work for some of those versions and that's why we didn't merge it yet.

daniel-bolanos commented 7 years ago

OK, I think most people are still sticking to Python 2?

kstohr commented 7 years ago

I can't speak for others, but I am on Python 3. Switched last year. Just tested @kognate's version https://github.com/kognate/watson_speech_to_text_websockets and it worked like a charm with watson-developer-cloud>=0.26.0, Python 3.5.2, on OSX 10.11.6. Thank you! Saved me so much time. I encourage you to release it.

kstohr commented 7 years ago

I spoke too soon. @kognate's solution fails on a longer .wav file. Sigh.

I added an issue to @kognate's GitHub repo, along with some fixes I attempted.
Any help getting this to work for larger files (ideally 10 minutes) would be hugely appreciated. I am a data person with no experience in WebSockets.

In the meantime, FYI, I was able to adapt @daniel-bolanos's solution to Python 3 with minimal changes. Successfully hit the 100MB ceiling making a CLI call. Will try to adapt it further to add new Watson features and run in an application, and see if that works out.

kstohr commented 7 years ago

Not sure if this is relevant... Although I was able to adapt @daniel-bolanos's solution (see gist), if an exception or an error is raised from a function within the websocket class, it can fail silently if the active threads are not properly shut down. I think it has something to do with [setting .daemon to true to exit](https://stackoverflow.com/questions/20596918/python-exception-in-thread-thread-1-most-likely-raised-during-interpreter-shutd).

Something to test for in the SDK, because if the thread is not shut down properly (e.g. after an internet connection issue) you effectively have to restart the interpreter to retry.

Same issue occurs with the CLI version .... (updated to Python 3 here) for anyone looking for something in the meantime...
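
To make the failure mode above concrete, here is a minimal standard-library sketch of one way to guard against it. This is not the code from the gist or the SDK. `Worker` and `run_and_report` are hypothetical names, and the 5-second timeout is arbitrary. The thread is marked as a daemon so a hung connection cannot keep the interpreter alive, and any exception is stored and re-raised in the caller instead of disappearing.

```python
import threading


class Worker(threading.Thread):
    """Thread that records an exception instead of losing it."""

    def __init__(self, target):
        super().__init__()
        self.daemon = True  # do not block interpreter exit if the work hangs
        self._work = target
        self.error = None

    def run(self):
        try:
            self._work()
        except Exception as exc:  # keep the error so the caller can see it
            self.error = exc


def run_and_report(target, timeout=5.0):
    """Run target in a daemon thread and re-raise its exception here."""
    worker = Worker(target)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        raise TimeoutError("worker did not finish in time")
    if worker.error is not None:
        raise worker.error


if __name__ == "__main__":
    def failing():
        raise ValueError("connection dropped")

    try:
        run_and_report(failing)
    except ValueError as exc:
        print("caught:", exc)
```

With something along these lines, a dropped connection would surface as an ordinary exception that can be retried without restarting the interpreter.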

germanattanasio commented 7 years ago

@kstohr It seems like you are using the work from @kognate. Do you think you can continue that work and maybe create a PR? I'm OK with supporting only Python 3 for now.

kstohr commented 7 years ago

@germanattanasio Actually, I ended up updating @daniel-bolanos's solution and am using that, as Autobahn seems stable other than the outstanding threading issue. It was also easier to adapt his solution for different use cases, params, and requirements. Happy to finish the work on @daniel-bolanos's solution and create a PR if that is of interest.

If I have time, I can take another look at @kognate's solution and see if I can adapt that co-routine to deal with larger files, different parameters, etc. This solution is currently very much geared to providing real-time responses. Turning off interim_results, for example, seems to cause the co-routine to fail.
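
For what it's worth, here is a rough sketch of a receive-side helper that does not depend on interim results arriving. It does not diagnose the co-routine failure; it only illustrates the idea. It assumes the response shape I understand from the service docs: JSON messages with a `results` list whose entries carry a `final` flag and `alternatives[0].transcript`, plus an `error` key on failures. `collect_final_transcripts` is a hypothetical name.

```python
import json


def collect_final_transcripts(raw_messages):
    """Return the transcripts of all results marked final, in order."""
    transcripts = []
    for raw in raw_messages:
        message = json.loads(raw)
        if "error" in message:
            # Surface service errors instead of silently dropping them.
            raise RuntimeError(message["error"])
        # State messages such as {"state": "listening"} have no results.
        for result in message.get("results", []):
            if result.get("final"):
                transcripts.append(result["alternatives"][0]["transcript"])
    return transcripts


if __name__ == "__main__":
    sample = [
        json.dumps({"state": "listening"}),
        json.dumps({"results": [{"final": False,
                                 "alternatives": [{"transcript": "hello "}]}]}),
        json.dumps({"results": [{"final": True,
                                 "alternatives": [{"transcript": "hello world "}]}]}),
    ]
    print(collect_final_transcripts(sample))
```

Because it only looks at results flagged as final, the same loop should behave the same whether interim_results is on or off.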

germanattanasio commented 7 years ago

Let's use @daniel-bolanos work then.

kstohr commented 7 years ago

Ok. Do you want the CLI version or the function call?

germanattanasio commented 7 years ago

We want the SpeechToTextV1 class to be updated with the support for WebSockets. The pull from @kognate was trying to do that.

kstohr commented 7 years ago

Ok.

daniel-bolanos commented 7 years ago

thanks @kstohr for working on this, I hear that Watson STT + python + websockets is in very high demand these days. Good stuff.

kstohr commented 7 years ago

Status update: Ok, got it to deal with keyboard interrupt and report back watson errors more gracefully. Need to refactor it (I'd made some changes for my app) to have it play nice with the SDK and will push. Coming soon...

@daniel-bolanos... very good stuff!


kstohr commented 7 years ago

Y'all. I pushed the Python 3.5 version of @daniel-bolanos's code with 'enhancements' for speaker_labels, etc. I'll be traveling for a few weeks, but feedback is welcome.

germanattanasio commented 7 years ago

@kstohr I have a lot of work on my plate but I will try to look at the PR this week. Sorry for the delay

kstohr commented 7 years ago

No problem. I am on vacation, so apologies for any delayed responses on my end. Also, just to be clear: I adapted it for Python 3.5 but didn't have a chance to backport it to 2.7. That change should be relatively easy, though, and it should then support multiple versions. (That's the advantage of using Autobahn with Twisted. Double check, but I think asyncio only became part of the standard library in version 3.4.)
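
A tiny sketch of the version split I have in mind, assuming the SDK would pick between Autobahn's two networking backends at import time. `pick_event_loop_backend` is a hypothetical helper, not anything in the PR.

```python
import sys


def pick_event_loop_backend(version_info=None):
    """Return 'asyncio' where it ships with Python (3.4+), else 'twisted'."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); asyncio joined the standard library in 3.4.
    return "asyncio" if tuple(version_info[:2]) >= (3, 4) else "twisted"


if __name__ == "__main__":
    print(pick_event_loop_backend())
```

Using Twisted everywhere would avoid the branch entirely, which is the simpler route if 2.7 support matters.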

kstohr commented 7 years ago

@germanattanasio just checking in to see the status of the PR.

germanattanasio commented 7 years ago

@ehdsouza was looking at this yesterday, @kstohr. I will follow up with her.

kstohr commented 7 years ago

@germanattanasio @ehdsouza shout if you need help back-porting to 2.7 or have any q's.

ehdsouza commented 6 years ago

This has been taken care of in PR: https://github.com/watson-developer-cloud/python-sdk/pull/376. It will be released in v1.1.0, coming shortly.