watson-developer-cloud / speech-to-text-websockets-python

Python client that interacts with the IBM Watson Speech To Text service through its WebSockets interface
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text.html

Option to Name Output File #5

Closed kylanhurt closed 9 years ago

kylanhurt commented 9 years ago

First of all thank you for the hard work that you have done on this project. It has certainly worked much better with large files than the SESSIONS / cURL method we were using before.

Again, I do not have much experience with Python so customizing the Python script is very challenging for me. Currently the script outputs to a 0.txt or 0.json file, and I was wondering if you might be willing to add a feature (or show me how to customize the script) to name the output file as we'd like or, at the very least, have the output file be given the same filename (obviously not extension) as the input file (or first file of multiple, etc).

I'm sure you guys are very busy but if you ever get a little bit of time to add such a feature it'd be greatly appreciated! My bread and butter is PHP but that doesn't do much good on a project like this. I suppose you could also show me how to set the output filename variable from the command line if that's easier. Thank you!

daniel-bolanos commented 9 years ago

I'm glad you tried this WebSockets client. cURL is definitely not the way to go for this use case.

Take a look at this section of the code:


def setUtterance(self, utt):
      self.uttNumber = utt[0]
      self.uttFilename = utt[1]
      self.summary[self.uttNumber] = {"hypothesis":"", "status":{"code":"", "reason":""}}
      self.fileJson = self.dirOutput + "/" + str(self.uttNumber) + ".json.txt"
      try:
         os.remove(self.fileJson)
      except OSError:
         pass

self.uttFilename contains the input filename with its path. self.fileJson is set to the name of the output file; you can change this to meet your needs.
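As a rough sketch of that change, assuming you want the output named after the input file: the helper below takes the input path and the output directory and builds a path that keeps the input's base name. `output_path_for` is a hypothetical name, not part of the client, and the `.json.txt` suffix is simply copied from the snippet above.

```python
import os


def output_path_for(utt_filename, dir_output, suffix=".json.txt"):
    """Build an output path that reuses the input file's base name.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Drop the directory part, then the extension (e.g. ".wav").
    stem = os.path.splitext(os.path.basename(utt_filename))[0]
    return os.path.join(dir_output, stem + suffix)


# Example call with made-up paths.
print(output_path_for("/data/audio/call_01.wav", "output"))
```

Inside setUtterance, the self.fileJson assignment could then become something like output_path_for(self.uttFilename, self.dirOutput). Note that two inputs with the same base name in different directories would map to the same output file, so this only works if base names are unique.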

kylanhurt commented 9 years ago

Got it. Thanks a ton Dani. I wanted the output filename to be very similar to the input filename (except JSON rather than WAV) and realized that I could write a separate script to do this since the output JSON files appear to correspond with the same order from the recordings.txt file. Your solution also makes sense, and I recently figured out that you can set variables from the command line as well so I suppose there are many solutions to the issue =D

The multi-threading with the web sockets interface is excellent. Is there a limit to how many threads IBM is willing to accept from one user / request?

daniel-bolanos commented 9 years ago

Hello @kylanhurt, yes, the output file names follow the order of the input files; otherwise it would be very confusing :)
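Since the numbering follows the input order, the separate-script approach mentioned above could look roughly like the sketch below. It rests on assumptions not confirmed in this thread: that recordings.txt lists one input path per line, that numbering starts at 0 and counts only non-empty lines, and that outputs are named `<index>.json.txt`. `plan_renames` and `apply_renames` are hypothetical names.

```python
import os


def plan_renames(recordings_path, output_dir, suffix=".json.txt"):
    """Pair each numbered output file with a name derived from its input.

    Assumes one input path per line and 0-based numbering of outputs.
    """
    with open(recordings_path) as handle:
        inputs = [line.strip() for line in handle if line.strip()]

    plan = []
    for index, input_path in enumerate(inputs):
        stem = os.path.splitext(os.path.basename(input_path))[0]
        old = os.path.join(output_dir, str(index) + suffix)
        new = os.path.join(output_dir, stem + suffix)
        plan.append((old, new))
    return plan


def apply_renames(plan):
    """Rename files, skipping missing sources and existing targets."""
    for old, new in plan:
        if os.path.exists(old) and not os.path.exists(new):
            os.rename(old, new)
```

Printing the plan before calling apply_renames is a cheap way to check that the pairing matches what you expect.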

I'm glad you like the multithreaded WS interface. There is a very high limit, which corresponds to the maximum capacity of the service, but believe me, it is very large and elastic, so do not worry about hitting it. Feel free to send as much as you want.