rochars / wavefile

Create, read and write wav files according to the specs. :star: :notes: :heart:
MIT License

Issue converting Twilio Media Stream (mulaw) to PCM for @aws-sdk/client-transcribe-streaming #37

Closed: ajporterfield closed this issue 1 year ago

ajporterfield commented 1 year ago

Hello, and thanks for this library! I'm trying to use it to convert audio coming in from Twilio so it can be transcribed by Amazon Transcribe. Unfortunately, I'm getting back empty responses from AWS (e.g. { TranscriptEvent: { Transcript: { Results: [] } } }). Here's the transform stream I'm using for the conversion.

new stream.Transform({
  transform(chunk, encoding, done) {
    const wav = new WaveFile();
    wav.fromScratch(1, 8000, '8m', Buffer.from(chunk, 'base64'));
    wav.fromMuLaw();
    this.push(Buffer.from(wav.data.samples));
    done();
  }
});

The audio from Twilio is mulaw encoded. AWS's best practices page describes the supported audio as "PCM (only signed 16-bit little-endian audio formats, which does not include WAV)". I'm assuming that's exactly what I get after running wav.fromMuLaw(), correct? Should I try using wav.getSamples(false, Int16Array) instead of wav.data.samples? Is there anything else I should try?

Thanks for your help.

ajporterfield commented 1 year ago

I was able to get this working with a very slight variation of the snippet I pasted above.

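const stream = require('stream');
const { WaveFile } = require('wavefile');

const transformStream = new stream.Transform({
  transform(chunk, encoding, done) {
    // chunk is already a Buffer of raw mulaw bytes (base64 is decoded before writing)
    const wav = new WaveFile();
    wav.fromScratch(1, 8000, '8m', chunk); // mono, 8 kHz, 8-bit mulaw
    wav.fromMuLaw(); // decode to 16-bit PCM
    this.push(Buffer.from(wav.data.samples)); // raw sample bytes, no WAV header
    done();
  }
});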
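
For context on what the mulaw-to-PCM step amounts to, here is a minimal sketch of standard G.711 mu-law expansion to signed 16-bit little-endian PCM in plain Node.js. It is not wavefile's implementation, and the names muLawDecodeSample and muLawToPcm16le are made up for illustration. It only shows the shape of output that the AWS quote above describes.

```javascript
// Hypothetical helper: expand one 8-bit mu-law byte to a 16-bit linear sample
// using the standard G.711 expansion (bias of 0x84).
function muLawDecodeSample(byte) {
  const u = ~byte & 0xff;             // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;              // top bit: sign
  const exponent = (u >> 4) & 0x07;   // next three bits: segment
  const mantissa = u & 0x0f;          // low four bits: step within the segment
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Hypothetical helper: convert a Buffer of mu-law bytes into a Buffer of
// signed 16-bit little-endian PCM (two output bytes per input byte).
function muLawToPcm16le(muLawBytes) {
  const out = Buffer.alloc(muLawBytes.length * 2);
  for (let i = 0; i < muLawBytes.length; i++) {
    out.writeInt16LE(muLawDecodeSample(muLawBytes[i]), i * 2);
  }
  return out;
}

// Example: 0xff is mu-law silence, 0x00 and 0x80 are the two extremes.
const pcm = muLawToPcm16le(Buffer.from([0xff, 0x00, 0x80]));
console.log(pcm.length); // 6 bytes for 3 input samples
```

The output is twice the size of the input, so a downstream consumer sees 16,000 bytes per second for 8 kHz mono audio.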

I'm writing to the transform stream like this.

transformStream.write(Buffer.from(payload, 'base64'))
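
One possible explanation for why the first version returned empty results (an assumption, since the original write call isn't shown): Buffer.from(value, 'base64') only decodes when value is a string. If the chunk reaching transform() is already a Buffer, the encoding argument is ignored and the bytes are copied unchanged, so base64 text would end up being treated as mulaw samples. A small sketch with a made-up payload:

```javascript
// Made-up three-byte "audio" payload, base64-encoded the way Twilio sends media.
const rawBytes = Buffer.from([0xff, 0x7f, 0x00]);
const payload = rawBytes.toString('base64');

// Decoding the string works as expected.
const decoded = Buffer.from(payload, 'base64');

// If the base64 text has already been turned into a Buffer (for example by
// writing the string to a stream), passing 'base64' again has no effect:
// Buffer.from(buffer, ...) just copies the bytes.
const asText = Buffer.from(payload);               // bytes of the base64 text itself
const notDecoded = Buffer.from(asText, 'base64');  // encoding argument is ignored

console.log(decoded.equals(rawBytes));   // true
console.log(notDecoded.equals(asText));  // true: still base64 text, not audio
```

Decoding the payload string before calling write(), as in the line above, avoids the problem.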