tyarkoni opened 5 years ago
On further investigation, it looks like we're not actually using speech_recognition to query the IBM API, but only to encode the audio. I have a vague recollection of you implementing it this way to solve some earlier problem with speech_recognition, @qmac. Either way, it should be an easy fix: I'll just change the way we make the request in the IBMSpeechAPIConverter.
Sounds good. Yeah, I did a split from speech_recognition when we wanted to request timestamps from the API. We could probably remove the dependency and write the audio encoding code ourselves.
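For what it's worth, the encoding side really is small: a minimal sketch of packing raw PCM samples into an in-memory WAV file using only the standard library, assuming 16-bit mono audio (the sample rate, sine-wave input, and function name here are illustrative, not taken from the pliers codebase):

```python
import io
import math
import struct
import wave

def encode_wav(samples, sample_rate=16000):
    """Pack 16-bit mono PCM samples into an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

# Example input: one second of a 440 Hz sine wave at 16 kHz.
samples = [int(32767 * math.sin(2 * math.pi * 440 * t / 16000))
           for t in range(16000)]
data = encode_wav(samples)
print(data[:4], data[8:12])  # WAV files start with a RIFF/WAVE header
```

That would cover the only thing we currently use speech_recognition for in this converter.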
I went down a bit of a rabbit hole on this one... I think the IBM APIs for this are in a state of flux—they currently have 3 (!) different ways of authenticating (username/password, service-specific key, and global IAM API key), and I'm fairly certain that the latest version of the API (released like 3 days ago) isn't actually supported by the watson-developer-cloud client at the moment. I imagine this will get fixed in the next few weeks, so I'll come back to this—just leaving this note to self here for posterity.
(We could also stick with the current approach and keep making direct HTTP requests, but they seem to discourage that now, plus the client library has some other nice features.)
IBM's cloud services recently switched from a username/password pair to an API key. Unfortunately, speech_recognition hasn't been updated yet (see open issue). This means we'll need to either switch to using IBM's own client library (probably the best long-term solution) or patch speech_recognition.
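If we do patch speech_recognition (or keep the direct HTTP requests for now), the auth change itself should be small: IAM-enabled IBM services accept HTTP Basic auth with the literal username "apikey" and the API key as the password. A minimal sketch of building that header, with a placeholder key:

```python
import base64

def iam_basic_auth_header(api_key):
    """Build the Authorization header for an IBM IAM API key.

    IAM-enabled IBM Cloud services accept Basic auth where the
    username is the literal string "apikey".
    """
    token = base64.b64encode(b"apikey:" + api_key.encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}

# "MY_FAKE_KEY" is a placeholder, not a real credential.
headers = iam_basic_auth_header("MY_FAKE_KEY")
print(headers["Authorization"])
```

The rest of the request (endpoint URL, audio body, timestamps parameter) would stay as it is today; only the credentials half changes.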