qasim9872 / aws-transcribe

A client for Amazon Transcribe using the websockets API
https://www.npmjs.com/package/aws-transcribe
MIT License

Transcription missing end of the file #19

Open agonza1 opened 4 years ago

agonza1 commented 4 years ago

First, I would like to thank you for setting up this module. It works great! I am just having an issue with voice transcription at the end of the file. I am streaming from a local file, very similar to what you did here. After using:

const transcribeStream = client
    .createStreamingClient()

And

fs.createReadStream(audioFile).pipe(new throttle(sampleRate)).pipe(transcribeStream);

I get a transcript, but it is always missing the end of the audio (e.g., I say "hello world" and the transcription returns "hello" and ends). I uploaded an example WAV file here: example_audio.zip. This file is transcribed correctly when using batch transcription instead of aws-transcribe.

qasim9872 commented 4 years ago

Hi @agonza1

Can you view this issue for details? We had a similar issue before, and the way it was fixed is described there. You can find it here

Aung-Myint-Thein commented 4 years ago

Hello, I used example_audio.zip and I also get only "hello".

I am trying to transcribe a file with this library too. The file I am using is the following: example_2.zip

I am not getting any results for my file, though. Would you mind taking a look at it and suggesting what I might be missing? These are the settings I used:

const sampleRate = 8000; 
...
.createStreamingClient({
        region: "ap-southeast-2",
        sampleRate,
        languageCode: "en-US",
})

fs.createReadStream(path.join(__dirname, 'file_name.wav')).pipe(new Throttle(16000)).pipe(transcribeStream);

By the way, I found a more elegant way to end the streaming of the file. I will comment in the library.

agonza1 commented 4 years ago

I found a small hack that seems to solve the issue. It is similar to what you did in the issue you linked: I just concatenated 1 second of silence at the end of the file. I used something like:

await childProcessPromise.spawn(
  '/opt/bin/ffmpeg',
  [
    '-loglevel', 'error',
    '-i', inputTempFileName,
    '-vn',
    '-ac', '1',
    '-filter_complex', 'aevalsrc=0:d=1[silence];[0:a][silence]concat=n=2:v=0:a=1[out]',
    '-map', '[out]',
    outputTempFileName,
  ],
  {env: process.env}
);

I believe the issue could come from the real-time Transcribe API. If the audio ends abruptly just after a word, the last sentence being transcribed is never returned.
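The same padding idea can also be sketched directly in Node, without re-encoding the file through ffmpeg. This is only a minimal sketch under stated assumptions: the audio is signed 16-bit PCM (where all-zero bytes are silence), and `makeSilence` and `withTrailingSilence` are hypothetical helper names, not part of aws-transcribe.

```javascript
const { Readable } = require('stream');

// Build a buffer of digital silence.
// Assumption: signed 16-bit PCM, where every sample of silence is zero.
function makeSilence(sampleRate, seconds = 1, channels = 1, bytesPerSample = 2) {
  const frames = Math.round(sampleRate * seconds);
  return Buffer.alloc(frames * channels * bytesPerSample); // zero-filled
}

// Wrap a readable source so that silence is emitted after its last chunk.
// (Hypothetical helper, shown for illustration only.)
function withTrailingSilence(source, sampleRate, seconds = 1) {
  async function* padded() {
    for await (const chunk of source) {
      yield chunk;
    }
    yield makeSilence(sampleRate, seconds);
  }
  // objectMode: false keeps this a plain byte stream, like fs.createReadStream.
  return Readable.from(padded(), { objectMode: false });
}

// Possible usage, reusing the names from the snippets above:
// withTrailingSilence(fs.createReadStream(audioFile), sampleRate)
//   .pipe(new Throttle(sampleRate))
//   .pipe(transcribeStream);
```

Whether one second is enough padding is not established here; it is simply the duration used in the ffmpeg command above.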

Aung-Myint-Thein commented 4 years ago

Hi @agonza1, what sample rate are you using? Is it the sample rate of the file? Would you mind explaining why you passed the sample rate to the throttle, while the example uses 2 x the sample rate? I am still getting empty results or bits and pieces of wrong transcription.

agonza1 commented 4 years ago

In my case the sample rate was 16 kHz, so I didn't need to put the 2 in front. I believe 16 kHz is the maximum supported, which is why you need to throttle. If you are not getting any transcriptions from your files, I would verify that the file is audio-only and uses the right codec. You could use something like:

const outputAudio = await childProcessPromise.spawn(
  '/opt/bin/ffprobe',
  ['-i', fileName, '-show_streams', '-select_streams', 'a', '-of', 'json', '-loglevel', 'error'],
  {env: process.env}
)

to check it.
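As a lighter-weight check than ffprobe, the format fields can also be read straight from the WAV header, which makes the throttle question above easier to reason about. The sketch below assumes a canonical 44-byte PCM header (the `fmt ` chunk directly after the RIFF header, which is common but not guaranteed), and `readWavFormat` and `realtimeBytesPerSecond` are hypothetical helper names. It also assumes that the throttle value is a byte rate, as the `throttle` package documents its argument. For 16-bit mono audio, real-time playback would then be 2 x the sample rate in bytes per second, which may be where the 2 in the example comes from.

```javascript
// Read format fields from a canonical 44-byte PCM WAV header.
// Assumption: the "fmt " chunk sits at byte offset 12; files with extra
// chunks before it would need a proper chunk walk instead.
function readWavFormat(header) {
  const isRiff = header.toString('ascii', 0, 4) === 'RIFF';
  const isWave = header.toString('ascii', 8, 12) === 'WAVE';
  if (!isRiff || !isWave) {
    throw new Error('not a RIFF/WAVE file');
  }
  return {
    audioFormat: header.readUInt16LE(20), // 1 means uncompressed PCM
    channels: header.readUInt16LE(22),
    sampleRate: header.readUInt32LE(24),
    bitsPerSample: header.readUInt16LE(34),
  };
}

// Bytes per second needed to stream the audio at real-time speed.
function realtimeBytesPerSecond({ sampleRate, channels, bitsPerSample }) {
  return sampleRate * channels * (bitsPerSample / 8);
}

// Possible usage (file name taken from the snippet earlier in the thread):
// const fd = fs.openSync('file_name.wav', 'r');
// const header = Buffer.alloc(44);
// fs.readSync(fd, header, 0, 44, 0);
// fs.closeSync(fd);
// const format = readWavFormat(header);
// const throttleRate = realtimeBytesPerSecond(format);
```

Under these assumptions, an 8000 Hz, 16-bit, mono file gives a rate of 16000 bytes per second, which matches the `new Throttle(16000)` used with `sampleRate = 8000` earlier in the thread.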