A FreeSWITCH module that streams L16 audio from a channel to a websocket endpoint. If the websocket server sends back responses (e.g. JSON), it can be used effectively with ASR engines such as IBM Watson, or for any other purpose you find applicable.

The purpose of mod_audio_stream was to make a simple, less dependent yet effective module to stream audio to, and receive responses from, a websocket server. It uses ixwebsocket, a C++ library for the websocket protocol, which is compiled as a static library. On Debian/Ubuntu it requires libfreeswitch-dev, libssl-dev, zlib1g-dev and libspeexdsp-dev, which are regular packages for a FreeSWITCH installation.
After cloning, please initialize and update the submodule:
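```
git submodule init
git submodule update
```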
If you built FreeSWITCH from source (e.g. with the install dir /usr/local/freeswitch), add the path to pkgconfig:

```
export PKG_CONFIG_PATH=/usr/local/freeswitch/lib/pkgconfig
```
To build the module, from the cloned repository directory:

```
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
sudo make install
```
Alternatively, you can clone and build the module using the provided build script:

```
sudo apt-get -y install git \
  && cd /usr/src/ \
  && git clone https://github.com/amigniter/mod_audio_stream.git \
  && cd mod_audio_stream \
  && sudo bash ./build-mod-audio-stream.sh
```
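Once built and installed, the module can be loaded like any other FreeSWITCH module, e.g. from fs_cli (add it to autoload_configs/modules.conf.xml if you want it loaded at startup):

```
# load the module into a running FreeSWITCH instance
fs_cli -x 'load mod_audio_stream'
```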
The following channel variables can be used to fine-tune the websocket connection and configure mod_audio_stream logging:
| Variable | Description | Default |
| --- | --- | --- |
| STREAM_MESSAGE_DEFLATE | true or 1, disables per-message deflate | off |
| STREAM_HEART_BEAT | number of seconds, interval to send the heart beat | off |
| STREAM_SUPPRESS_LOG | true or 1, suppresses printing to log | off |
| STREAM_BUFFER_SIZE | buffer duration in milliseconds, divisible by 20 | 20 |
| STREAM_EXTRA_HEADERS | JSON object for additional headers in string format | none |
To disable per-message deflate, set STREAM_MESSAGE_DEFLATE to true|1.

To suppress printing to the log, set STREAM_SUPPRESS_LOG to true|1. Events are still fired; it only affects printing to the log.

STREAM_BUFFER_SIZE actually represents the duration of the audio chunk sent to the websocket. If you want to send e.g. 100ms audio packets to your ws endpoint, set this variable to 100. If omitted, the default packet size of 20ms will be sent as grabbed from the audio channel (which is the default FreeSWITCH frame size).

STREAM_EXTRA_HEADERS should contain a JSON object, in string format, with additional headers to send with the websocket connection, for example:
```
{
  "Header1": "Value1",
  "Header2": "Value2",
  "Header3": "Value3"
}
```
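These are regular channel variables, so they can be set in the dialplan or, for an active channel, from fs_cli; the values below are only examples:

```
# send 100 ms audio chunks and suppress module logging for this channel
fs_cli -x 'uuid_setvar <uuid> STREAM_BUFFER_SIZE 100'
fs_cli -x 'uuid_setvar <uuid> STREAM_SUPPRESS_LOG true'
```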
The FreeSWITCH module exposes the following API commands:

```
uuid_audio_stream <uuid> start <wss-url> <mix-type> <sampling-rate> <metadata>
```

Attaches a media bug and starts streaming audio (in L16 format) to the websocket server. The FreeSWITCH default is 8k; if sampling-rate is other than 8k, the audio will be resampled.
- `uuid` - FreeSWITCH channel unique id
- `wss-url` - websocket url, `ws://` or `wss://`
- `mix-type` - choice of
  - "mono" - single channel containing the caller's audio
  - "mixed" - single channel containing both caller and callee audio
  - "stereo" - two channels, caller audio in one and callee audio in the other
- `sampling-rate` - choice of
  - "8k" - an 8000 Hz sample rate will be generated
  - "16k" - a 16000 Hz sample rate will be generated
- `metadata` - (optional) a valid utf-8 text to send. It will be sent first, before audio streaming starts.
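For example, starting a stream from fs_cli (the uuid, endpoint url and metadata below are placeholders):

```
# stream the caller's audio at 16 kHz to a (hypothetical) websocket endpoint
fs_cli -x 'uuid_audio_stream <uuid> start wss://example.com/stream mono 16k {"callId":"1234"}'
```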
```
uuid_audio_stream <uuid> send_text <metadata>
```

Sends a text to the websocket server. Requires a valid utf-8 text.
```
uuid_audio_stream <uuid> stop <metadata>
```

Stops the audio stream and closes the websocket connection. If metadata is provided it will be sent before the connection is closed.
```
uuid_audio_stream <uuid> pause
```

Pauses the audio stream.
```
uuid_audio_stream <uuid> resume
```

Resumes the audio stream.
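For instance, from fs_cli (the uuid and metadata are placeholders):

```
# send a text frame to the websocket server
fs_cli -x 'uuid_audio_stream <uuid> send_text {"event":"update"}'
# pause and resume streaming
fs_cli -x 'uuid_audio_stream <uuid> pause'
fs_cli -x 'uuid_audio_stream <uuid> resume'
# stop streaming and close the websocket connection
fs_cli -x 'uuid_audio_stream <uuid> stop {"reason":"done"}'
```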
The module will generate the following event types:

- `mod_audio_stream::json`
- `mod_audio_stream::connect`
- `mod_audio_stream::disconnect`
- `mod_audio_stream::error`
- `mod_audio_stream::play`
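To watch these events from the console you can subscribe in an fs_cli session (this assumes they are fired as CUSTOM events with the subclasses listed above):

```
/event plain CUSTOM mod_audio_stream::json mod_audio_stream::connect mod_audio_stream::disconnect mod_audio_stream::error mod_audio_stream::play
```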
Message received from the websocket endpoint. JSON is expected, but it contains whatever the websocket server's response is.
Name: mod_audio_stream::json Body: WebSocket server response
Successfully connected to websocket server.
Name: mod_audio_stream::connect Body: JSON
```
{
  "status": "connected"
}
```
Disconnected from websocket server.
Name: mod_audio_stream::disconnect Body: JSON
```
{
  "status": "disconnected",
  "message": {
    "code": 1000,
    "reason": "Normal closure"
  }
}
```

- `code`: `<int>`
- `reason`: `<string>`
There is an error with the connection. Multiple fields will be available on the event to describe the error.
Name: mod_audio_stream::error Body: JSON
```
{
  "status": "error",
  "message": {
    "retries": 1,
    "error": "Expecting status 101 (Switching Protocol), got 403 status connecting to wss://localhost, HTTP Status line: HTTP/1.1 403 Forbidden\r\n",
    "wait_time": 100,
    "http_status": 403
  }
}
```

- `retries`: `<int>`
- `error`: `<string>`
- `wait_time`: `<int>`
- `http_status`: `<int>`
Name: mod_audio_stream::play Body: JSON
The websocket server may return a JSON object containing base64 encoded audio to be played to the user. To use this feature, the response must follow this format:
```
{
  "type": "streamAudio",
  "data": {
    "audioDataType": "raw",
    "sampleRate": 8000,
    "audioData": "base64 encoded audio"
  }
}
```

- `audioDataType`: `<raw|wav|mp3|ogg>`
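As an illustration, a server could produce the audioData value by base64-encoding an audio file (the file name here is hypothetical):

```
# encode a raw 8 kHz PCM file as a single-line base64 string
base64 -w 0 reply_8khz.raw
```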
The event generated by the module (subclass: `mod_audio_stream::play`) will be the same as the `data` element, with a `file` field added to it representing the file path:
```
{
  "audioDataType": "raw",
  "sampleRate": 8000,
  "file": "/path/to/the/file"
}
```
If printing to the log is not suppressed, the response printed to the console will look the same as the event. The original response containing the base64 encoded audio is replaced because it can be quite large. All files generated by this feature reside in the temp directory and are deleted when the session is closed.