PCM Opus needed for Amazon Lex

jbosnjakovic commented 7 years ago

Does anyone have experience using this library to communicate with Amazon Lex? I see that people developing applications for IBM's Watson have the same issue. Did you find a solution or workaround for recording lossless formats with this library?

I am limited to the following formats:

PCM format, audio data must be in little-endian byte order.
- audio/l16; rate=16000; channels=1
- audio/x-l16; sample-rate=16000; channel-count=1
- audio/lpcm; sample-rate=8000; sample-size-bits=16; channel-count=1; is-big-endian=false
Opus format
- audio/x-cbr-opus-with-preamble; preamble-size=0; bit-rate=256000; frame-size-milliseconds=4

I would appreciate any advice.

jaypeng2015 commented 6 years ago

Tried to use the format audio/lpcm; sample-rate=8000; sample-size-bits=16; channel-count=1; is-big-endian=false on iOS, by setting the prepareRecordingAtPath options to

{
  SampleRate: 8000,
  Channels: 1,
  AudioQuality: 'Low',
  AudioEncoding: 'lpcm',
  IncludeBase64: true,
}

Then found that there's no way to set sample-size-bits. So tried locally to modify AudioRecorderManager.m:

## define
NSNumber  *_audioBitDepth; 
_audioBitDepth = [NSNumber numberWithInt:16];

## set
NSDictionary *recordSettings = [NSDictionary dictionaryWithObjectsAndKeys:
          _audioQuality, AVEncoderAudioQualityKey,
          _audioEncoding, AVFormatIDKey,
          _audioChannels, AVNumberOfChannelsKey,
          _audioSampleRate, AVSampleRateKey,
          _audioBitDepth, AVEncoderBitDepthHintKey,
          _audioBitDepth, AVLinearPCMBitDepthKey,
          nil];

The last step is to send the data:

AudioRecorder.onFinished = async data => {
  const params = {
    botAlias: '$LATEST',
    botName: 'TheBotName',
    inputStream: Buffer.from(data, 'base64'),
    userId: lexUserId,
    contentType: 'audio/lpcm; sample-rate=8000; sample-size-bits=16; channel-count=1; is-big-endian=false',
    accept: 'audio/mpeg',
  };
  // The request must be sent inside the async callback, where params is in scope and await is allowed
  const lexResponse = await lexRunTime.postContent(params).promise();
  ...
};

Then I got it working.

So prepareRecordingAtPath would need to be updated to accept a bitDepth property in its options.

ijemmy commented 6 years ago

@jaypeng2015 Thanks a lot, you saved me several days! It seems the bit depth is 16 by default, so we don't have to change AudioRecorderManager.m.

https://stackoverflow.com/questions/22710791/avaudiorecorder-default-record-setting

For those who get stuck, the info below might help:

Include "buffer": "^5.0.6", in package.json and require it with const Buffer = require('buffer').Buffer;
If you copy the example, don't forget to change the file name from test.aac to test.lpcm.
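A quick way to sanity-check the decoded recording before sending it to Lex is to estimate its duration from the byte count. This is only a sketch: `pcmDurationSeconds` is a made-up helper, and it assumes the base64 payload is raw, headerless 16-bit PCM, which nothing in this thread confirms. If the recorder wraps the samples in a container, the result will be slightly off.

```javascript
const { Buffer } = require('buffer');

// Hypothetical helper, not part of react-native-audio.
// Assumes raw PCM with no file header (an assumption, not verified here).
function pcmDurationSeconds(base64Audio, sampleRate = 8000, channels = 1, bitDepth = 16) {
  // Decode the base64 string into raw bytes
  const bytes = Buffer.from(base64Audio, 'base64');
  // Bytes produced per second of audio: rate * channels * bytes per sample
  const bytesPerSecond = sampleRate * channels * (bitDepth / 8);
  return bytes.length / bytesPerSecond;
}

// 16000 bytes of silence at 8000 Hz, 16-bit mono is one second of audio
const oneSecondOfSilence = Buffer.alloc(16000).toString('base64');
console.log(pcmDurationSeconds(oneSecondOfSilence));
```

A duration near zero, or one far from what you actually recorded, usually means the sample rate or encoding options do not match what you assumed.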

jaypeng2015 commented 6 years ago

Yeah, true. I must have done something else wrong that made me think bit depth was the problem. Anyway, this config works without changing anything:

// before calling AudioRecorder.startRecording
const audioPath = AudioUtils.DocumentDirectoryPath + '/test.lpcm';
AudioRecorder.prepareRecordingAtPath(audioPath, {
  SampleRate: 8000,
  Channels: 1,
  AudioQuality: 'High',
  AudioEncoding: 'lpcm',
  IncludeBase64: true,
});
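Since the contentType sent to Lex has to describe exactly what was recorded, it may help to derive the string from the recording options instead of typing it twice. A minimal sketch, assuming the 16-bit default mentioned above; `lexPcmContentType` is a hypothetical helper, not something react-native-audio or the AWS SDK provides:

```javascript
// Hypothetical helper: builds the audio/lpcm content type from recording settings.
// bitDepth defaults to 16, matching the default bit depth discussed in this thread.
function lexPcmContentType({ sampleRate, channels, bitDepth = 16 }) {
  return [
    'audio/lpcm',
    `sample-rate=${sampleRate}`,
    `sample-size-bits=${bitDepth}`,
    `channel-count=${channels}`,
    'is-big-endian=false', // Lex requires little-endian PCM
  ].join('; ');
}

// Same options as passed to prepareRecordingAtPath
const recordingOptions = {
  SampleRate: 8000,
  Channels: 1,
  AudioQuality: 'High',
  AudioEncoding: 'lpcm',
  IncludeBase64: true,
};

const contentType = lexPcmContentType({
  sampleRate: recordingOptions.SampleRate,
  channels: recordingOptions.Channels,
});
console.log(contentType);
```

The resulting value can be used as the contentType field of the postContent params, so changing SampleRate in one place keeps the two in sync.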

Thanks @ijemmy

ijemmy commented 6 years ago

@jaypeng2015 You're welcome. Did you find a good way to play the returned audioStream from Lex? All the react-native libraries I found play either from a file or from a URL.

The best workaround I could find so far is to embed an <audio> element from an HTML page in a WebView and pass the audio data to it via .postMessage(). I wonder if there is a less hacky option.
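For the WebView route, one option is to turn the audio bytes into a data URI that the page can assign to the src of its <audio> element. This is a rough sketch only: `toAudioDataUri` is a made-up name, and it assumes Lex was asked for audio/mpeg via the accept parameter.

```javascript
const { Buffer } = require('buffer');

// Hypothetical helper: wraps raw audio bytes in a data URI.
// mimeType defaults to audio/mpeg, assuming that is what was requested from Lex.
function toAudioDataUri(audioStream, mimeType = 'audio/mpeg') {
  // Buffer.from accepts a Buffer, Uint8Array, or array of bytes
  const base64 = Buffer.from(audioStream).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Example with two placeholder bytes standing in for a real audio stream
console.log(toAudioDataUri(Buffer.from([0xff, 0xfb])));
```

The string can then be sent through .postMessage() and set as the audio element's src inside the page.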

jaypeng2015 commented 6 years ago

@ijemmy I play the sound the following way, basically saving the audio to the local file system before playing it:

import RNFS from 'react-native-fs';
import Sound from 'react-native-sound';

...

if (lexResponse.audioStream) {
  const path = `${RNFS.DocumentDirectoryPath}/test.mp3`;
  // Write the returned audio to a local file as base64
  const data = Buffer.from(lexResponse.audioStream).toString('base64');
  await RNFS.writeFile(path, data, 'base64');
  // Play the file, releasing the player once playback finishes
  const speech = new Sound(path, '', err => {
    if (!err) speech.play(() => speech.release());
    else console.log('Play sound error', err);
  });
}

...

ijemmy commented 6 years ago

@jaypeng2015 I see. Thanks a lot for sharing :)

jsierles / react-native-audio

PCM Opus needed for Amazon Lex #213