agouil / podcast-transcriber

A simple audio file transcriber that uses the Google Cloud Speech API for transcription.
MIT License

Not transcribing long audio #5

Open Jorlejeu opened 5 years ago

Jorlejeu commented 5 years ago

Hey, it's me again.

Podcast-Transcriber works fine with audio shorter than 40 seconds, but returns this error with longer audio:

    Transcribing [ 1 / 4 ] /Users/jordanl/Documents/podcast-transcriber-master/podcast_transcriber/tmp/tmp0US1Jn/Vocaroo_s0UA945unKq4004.raw ...
    Traceback (most recent call last):
      File "example.py", line 9, in <module>
        podcast_transcriber.transcribe(args.input_file)
      File "/Users/jordanl/Documents/podcast-transcriber-master/podcast_transcriber/podcast_transcriber.py", line 78, in transcribe
        transcript = transcriber.transcribe_many(chunks)
      File "/Users/jordanl/Documents/podcast-transcriber-master/podcast_transcriber/transcriber.py", line 66, in transcribe_many
        for alternatives in response['results']:
    KeyError: 'results'

I tested the transcriber with a 44-second audio file and it works well (it split the audio in two, transcribed each chunk correctly, and the output is fine). However, with audio of 80 seconds and 120 seconds, I get this error. I don't really understand the transcribe_many function. Any hint on how to resolve the issue?

Thanks a lot.

agouil commented 5 years ago

@Jorlejeu something is wrong with the transcription of one of the audio chunks, and the response doesn't include the expected dictionary keys. Can you print and send the response you get from the Google API before the error occurs?
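
A minimal sketch of that debugging step, assuming the response is a plain dict parsed from the API's JSON reply, as the `response['results']` lookup in the traceback suggests. The helper name `extract_transcripts` is hypothetical and not part of the repository. It prints the raw response whenever the `results` key is missing instead of raising `KeyError`, so an error payload or an empty reply for a chunk becomes visible:

```python
import json


def extract_transcripts(response):
    """Return the first transcript of each result, or [] if there are none.

    Hypothetical helper: `response` is assumed to be the dict decoded
    from the Speech API's JSON reply for a single audio chunk.
    """
    results = response.get('results')
    if results is None:
        # No 'results' key: dump the whole reply so the cause
        # (for example an 'error' object, or an empty reply) can be inspected.
        print(json.dumps(response, indent=2))
        return []

    transcripts = []
    for result in results:
        alternatives = result.get('alternatives', [])
        if alternatives:
            # Keep only the first alternative of each result.
            transcripts.append(alternatives[0].get('transcript', ''))
    return transcripts
```

Calling something like this on each chunk's response inside `transcribe_many` would let the remaining chunks continue while showing exactly what the API sent back for the failing one.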