EARS is a proof-of-concept implementation of a convolutional neural network for live environmental audio processing & recognition on low-power SoC devices (so far it has been developed and tested only on a Raspberry Pi 3 Model B).
EARS features a background thread for audio capture & classification and a Bokeh server based dashboard providing live visualization and audio streaming from the device to the browser.
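As a rough illustration of the background-thread pattern described above, here is a minimal sketch using only the standard library. It is not the EARS implementation: `capture_loop`, `fake_read_chunk` and `fake_classify` are hypothetical stand-ins for the real audio capture and convnet code, and the queue stands in for whatever the dashboard actually reads from.

```python
import queue
import threading
import time


def capture_loop(read_chunk, classify, results, stop_event):
    """Read audio chunks, classify them, and publish labels until stopped."""
    while not stop_event.is_set():
        chunk = read_chunk()      # hypothetical: would wrap the audio device
        label = classify(chunk)   # hypothetical: would wrap the convnet
        results.put(label)


def fake_read_chunk():
    """Placeholder for audio capture: waits briefly, returns silent samples."""
    time.sleep(0.01)
    return [0.0] * 1024


def fake_classify(chunk):
    """Placeholder for the model: only tells silence from everything else."""
    return "silence" if max(chunk) == 0.0 else "unknown"


results = queue.Queue()
stop_event = threading.Event()
worker = threading.Thread(
    target=capture_loop,
    args=(fake_read_chunk, fake_classify, results, stop_event),
    daemon=True,
)
worker.start()

# A dashboard callback could poll the queue in a similar way.
first_label = results.get(timeout=2)

# Ask the worker to finish and wait for it to exit.
stop_event.set()
worker.join(timeout=2)
print(first_label)
```

Keeping capture and classification off the main thread means a slow prediction does not block the user interface; the queue is the only shared state in this sketch.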
EARS is quite taxing on the CPU, so a proper cooling solution (such as a heatsink) is advisable. With only occasional use of the Bokeh app, it should still work fine without one.
The live audio stream can get choppy or out-of-sync, especially when using the mute/unmute button.
Actual production deployments would benefit from a server-node architecture in which SoC devices only push predictions, status updates and audio feeds to a central server that handles all end-user interaction, material browsing and visualization. This may be implemented in future versions, but no promises here.
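To make the idea of a node pushing predictions more concrete, the sketch below shows one way such a message could be serialized. Nothing here exists in EARS: the function name, the field names and the `pi-livingroom` device identifier are all hypothetical, and the transport (HTTP, MQTT or anything else) is deliberately left out.

```python
import json
import time


def build_prediction_message(device_id, label, confidence, timestamp=None):
    """Serialize one prediction as a JSON string a node could push upstream.

    All field names are illustrative assumptions, not an EARS protocol.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    message = {
        "device_id": device_id,
        "label": label,
        "confidence": round(confidence, 4),
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    # sort_keys gives a stable key order, which is handy for logs and tests
    return json.dumps(message, sort_keys=True)


payload = build_prediction_message("pi-livingroom", "dog", 0.87, timestamp=1500000000)
print(payload)
```

Sending only small messages like this would keep the device's workload limited to capture and classification, which is the point of the proposed split.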
EARS has been developed and tested on a Raspberry Pi 3 Model B device. To recreate the environment used for developing this demo:
- Log in (`user: pi, password: raspberry`).
- Run `sudo raspi-config` to enable SSH.
- Regenerate the SSH host keys:
  - `sudo rm /etc/ssh/ssh_host_*`
  - `sudo dpkg-reconfigure openssh-server`
- Install Miniconda to `/opt/conda`:
  - `wget http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-armv7l.sh`
  - `chmod +x Miniconda3-latest-Linux-armv7l.sh`
  - `sudo ./Miniconda3-latest-Linux-armv7l.sh`
- Add `export PATH="/opt/conda/bin:$PATH"` to the end of `/home/pi/.bashrc`. Then reload with `source /home/pi/.bashrc`.
Install Python with required packages:
- `conda config --add channels rpi`
- `conda create -n ears python=3.6`
- `source activate ears`
- `conda install cython numpy pandas scikit-learn cffi h5py`
- `sudo apt-get install portaudio19-dev`
- Place the EARS files in `/home/pi/ears`. Then install the required packages by issuing: `pip install -r /home/pi/ears/requirements.txt`
- List the available audio devices with `python -m sounddevice`.
- Update the `--allow-websocket-origin` option inside the `/home/pi/ears/run.sh` file with the IP address of the Raspberry Pi device.
- Start the app:
  - `chmod +x /home/pi/ears/run.sh`
  - `cd /home/pi/ears`
  - `./run.sh`
- Open the dashboard in a browser at `http://RASPBERRY_PI_IP:5006/`.
For the time being, EARS comes preloaded with a very rudimentary model trained on the ESC-50 dataset (a convnet consisting of 3 layers with 3x3 square filters), so its recognition capabilities are limited in actual live scenarios.
If you want to train the same model on a different dataset:
- Place the audio files in `ears/dataset/audio`.
- Replace the `ears/dataset/dataset.csv` file with a new CSV using the columns `filename,category`.
- Run `python train.py`; this should result in the following files being generated on the server:

File | Description |
---|---|
model.h5 | weights of the learned model |
model.json | a serialized architecture of the model (Keras >=2.0.0) |
model_labels.json | dataset labels |
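
The `dataset.csv` file from the steps above could be generated with a small helper like the one below. This is a sketch under stated assumptions: the CSV is assumed to start with a `filename,category` header row, the audio files are assumed to be WAV files, and `write_dataset_csv` with its `categories` mapping is a hypothetical helper rather than part of EARS. The demo runs against a temporary directory instead of `ears/dataset`.

```python
import csv
import os
import tempfile


def write_dataset_csv(audio_dir, categories, csv_path):
    """Write a filename,category CSV for the WAV files found in audio_dir.

    `categories` maps each filename to its label. Files without an entry
    raise an error instead of being silently skipped.
    """
    filenames = sorted(
        name for name in os.listdir(audio_dir) if name.lower().endswith(".wav")
    )
    missing = [name for name in filenames if name not in categories]
    if missing:
        raise ValueError("no category for: " + ", ".join(missing))
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["filename", "category"])  # assumed header row
        for name in filenames:
            writer.writerow([name, categories[name]])
    return len(filenames)


# Demo on throwaway files; real use would point at ears/dataset/audio.
with tempfile.TemporaryDirectory() as root:
    audio_dir = os.path.join(root, "audio")
    os.makedirs(audio_dir)
    for name in ("rain_01.wav", "dog_01.wav"):
        open(os.path.join(audio_dir, name), "wb").close()
    csv_path = os.path.join(root, "dataset.csv")
    count = write_dataset_csv(
        audio_dir, {"rain_01.wav": "rain", "dog_01.wav": "dog"}, csv_path
    )
    with open(csv_path) as handle:
        csv_text = handle.read()

print(csv_text)
```

Failing loudly on unlabeled files is a deliberate choice here, since a missing label would otherwise only surface later during training.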
If you want to train a completely different model, have a look at `train.py`. In this case you probably know what to do either way.
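
Whichever model is trained, the three generated files listed above can presumably be loaded back with the standard Keras calls `model_from_json` and `load_weights`. The sketch below assumes that `model.json` holds the output of `model.to_json()` and that `model_labels.json` is a JSON list of category names; neither detail is confirmed here, and both helper functions are hypothetical.

```python
import json
import os


def load_trained_model(model_dir="."):
    """Rebuild the model and label list from the three generated files."""
    # Imported lazily so the rest of this sketch runs without Keras installed.
    from keras.models import model_from_json

    with open(os.path.join(model_dir, "model.json")) as handle:
        model = model_from_json(handle.read())
    model.load_weights(os.path.join(model_dir, "model.h5"))
    with open(os.path.join(model_dir, "model_labels.json")) as handle:
        labels = json.load(handle)  # assumed: a JSON list of category names
    return model, labels


def top_labels(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    ranked = sorted(
        zip(labels, probabilities), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Example with made-up probabilities for three categories.
print(top_labels([0.1, 0.7, 0.2], ["dog", "rain", "siren"], k=2))
```

`top_labels` is independent of Keras, so it can be checked on its own before a trained model is available.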
MIT © Karol J. Piczak