ttompk / voice_command

tf voice wake word

Tensorflow Wake Command Identification

The purpose of this project is to create a wake word detector on a Raspberry Pi Zero. The task that will be carried out upon wake word identification has not been determined; in the interim, an LED attached to a GPIO pin will flash when the wake word is identified.

The wake word will be identified from features extracted from real-time audio sampled with a 2-mic ReSpeaker breakout board (Seeed Studio) attached to the Raspberry Pi. Inference on the digital audio input data, i.e. the features, will be performed using a tensorflow model trained on a MacBook Pro, converted to a tensorflow lite model (serialized), and installed on the Pi Zero.

Approximate Steps:

Hardware and Environment

My initial plan was to train a model using tensorflow on my mid-2010 MacBook Pro (16GB). I intended to use tensorflow 2.1 or greater since it has keras baked in and Google's reference docs use that version in their code. Immediately there was an issue loading tensorflow 2.1 or greater because the laptop's Intel chip does not support AVX. As there was no getting around this requirement, I needed to shift development to a Colab notebook...because it's free. I could have spun up a machine on Google Cloud Compute...but that's not free, and since I've never tried developing on Colab I thought I'd give it a shot.

Development Computer

Inference Machine

Software

Seeed Studio 2-mic ReSpeaker

hat wiki: https://wiki.seeedstudio.com/ReSpeaker_2_Mics_Pi_HAT/
Since a Raspberry Pi Zero lacks audio input and output capabilities (aside from HDMI), I needed to find a method to get our wake words to the model for inference. The 2-mic board is capable of both audio input and audio output. The manufacturer describes it as:

ReSpeaker 2-Mics Pi HAT is a dual-microphone expansion board for Raspberry Pi designed for AI and voice applications.

The board fits onto the Pi's 40-pin GPIO header.

Installation of 2-mic respeaker on Raspberry Pi Zero W

These instructions assume you are running Raspberry Pi OS Buster.

Step 1

  1. Update/upgrade the Pi Zero.
    sudo apt-get update
    sudo apt-get upgrade
  2. Clone the seeed-voicecard repo:
    git clone https://github.com/respeaker/seeed-voicecard.git
  3. Run the install.sh bash script in the new repo
    sudo seeed-voicecard/install.sh
  4. Reboot
    sudo reboot

Step 2

  1. Verify the card has been installed and the playback device is detected by the Pi. Verify the output of this command matches the wiki.
    aplay -l
  2. Similarly, verify the audio input device is also listed (same wiki).
    arecord -l
  3. Test playback on the speaker. You can use headphones or plug a speaker into the 2-pin JST PH (2.0mm pitch) connector. I've written two methods to play the sounds back; the first requires headphones. That method uses the microphones on the card to record a few seconds of sound and then plays it back on the headphones in a record/play loop. If you run this command with a speaker attached you will encounter a horribly loud shrieking sound due to the feedback between speaker and mics! The 'speaker' method tests the speaker and mic separately: first playback of a wav file in the home directory, followed by a separate mic recording and playback.
    Warning!! Do not use the following command with speakers or your eardrums will burst!!
    • Headphones only. You will need to replace the '1' in '-Dhw:1' with the number of the card from 'aplay -l' (Ctrl+C to exit):
      arecord -f cd -Dhw:1 | aplay -Dhw:1
    • With speaker or headphones. You will need to replace the '1' in '-Dhw:1' with the correct output device number. This records 5 seconds of sound through the mics, then plays it back through the speakers (not simultaneously):
      arecord -f S16_LE -d 5 -r 16000 -Dhw:1 /tmp/test-mic.wav -c 2 && aplay -Dhw:1 /tmp/test-mic.wav
      Reference on playing tunes from command line in linux.
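As an aside, if you don't have a wav file handy for the speaker-only test, a short test tone can be generated with just the Python standard library. This is a sketch; the 440 Hz frequency, 2-second duration, 16 kHz rate, and /tmp output path are my choices, not anything required by the board:

```python
import math
import struct
import wave

def write_test_tone(path, freq_hz=440.0, seconds=2.0, rate=16000, amplitude=0.5):
    """Write a mono 16-bit PCM sine-wave wav file for speaker testing."""
    n_frames = int(seconds * rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        for i in range(n_frames):
            sample = int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
            wav.writeframes(struct.pack("<h", sample))

write_test_tone("/tmp/test-tone.wav")
```

Then 'aplay -Dhw:1 /tmp/test-tone.wav' should play a clean 2-second tone.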

If the playback is faint or too loud, don't worry. We can adjust it using the built-in alsa mixer.

  1. Run the AlsaMixer from the command line:
    alsamixer
    Press F6 to select the 'seeed-voicecard' sound card. There are a ton of options to move around...but I'm not sure what they do...perhaps I'll look into it later to fine tune the input/output.

  2. There is a driver for the chip that controls the included LEDs. This code was provided:
    sudo pip install spidev
    cd ~/
    git clone https://github.com/respeaker/mic_hat.git
    cd mic_hat
    python pixels.py

LEDs worked fine running the pixels.py file.

  1. User Button: the on-board user button is connected to GPIO17.
    see sample code here to test button: ref

  2. Install pyaudio:
    The pyaudio install failed: I was missing the 'portaudio' library on the Pi Zero. I was able to install it using the following:
    sudo apt-get install portaudio19-dev
    pip install pyaudio
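With pyaudio installed, a minimal capture sketch looks roughly like the following. The device index, chunk size, and the little-endian 16-bit decoding helper are my assumptions; check your card's index against 'arecord -l' first. The pyaudio import sits inside the function so the decoding helper can be used even without the library present:

```python
import struct

def bytes_to_samples(raw):
    """Decode little-endian signed 16-bit PCM bytes into a list of ints."""
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))

def record_chunks(seconds=4, rate=16000, chunk=1024, device_index=1):
    """Capture mono 16-bit audio from the card (device index is hypothetical)."""
    import pyaudio  # imported here so bytes_to_samples works without pyaudio
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, input_device_index=device_index,
                     frames_per_buffer=chunk)
    frames = []
    try:
        for _ in range(int(rate / chunk * seconds)):
            frames.append(bytes_to_samples(stream.read(chunk)))
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
    return frames
```

Each entry in the returned list is then a chunk of integer samples ready for feature extraction.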

Training Data

Google Command Dataset: The docs
Download directly
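One useful trick from the dataset's documentation is assigning each wav file to train/validation/test by hashing its filename, so the split is stable across runs and clips from one speaker never leak between sets. A sketch of that idea (the 10%/10% split sizes are my choice; the '_nohash_' convention comes from the dataset's file naming):

```python
import hashlib
import re

MAX_PER_CLASS = 2 ** 27 - 1  # large constant keeping the hash-to-percent math stable

def which_set(filename, validation_pct=10, testing_pct=10):
    """Deterministically assign a wav file to 'training'/'validation'/'testing'.

    Files from the same speaker share the text before '_nohash_', so all of a
    speaker's clips land in the same split.
    """
    base = re.sub(r"_nohash_.*$", "", filename)
    h = hashlib.sha1(base.encode("utf-8")).hexdigest()
    pct = (int(h, 16) % (MAX_PER_CLASS + 1)) * (100.0 / MAX_PER_CLASS)
    if pct < validation_pct:
        return "validation"
    elif pct < validation_pct + testing_pct:
        return "testing"
    return "training"
```

For example, 'c50f55b8_nohash_0.wav' and 'c50f55b8_nohash_1.wav' always get the same answer, because only the speaker prefix is hashed.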

Feature Extraction

Ref docs for Mel Frequency Cepstral Coefficient: docs
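For context on what MFCCs measure: the mel scale warps frequency so that equal steps sound roughly equally spaced to the ear. The standard conversion can be checked in a few lines of stdlib Python (the 2595/700 constants are the common HTK-style formula, my choice of variant):

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
```

By construction the two functions are inverses, and 1000 Hz maps to roughly 1000 mels, which is how the scale is anchored.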

Tensorflow Lite

Installing TensorFlow Lite to Pi Zero

To add tensorflow lite to the Pi Zero, it must be compiled natively on the Zero using CMake. There are two methods to install TF Lite, each described in separate TF lite documents. I tried the first method here without success on my Pi Zero, though I was hopeful it would work.

Method 1. Try this statement:
PATH=../rpi_tools/arm-bcm2708/arm-rpi-4.9.3-linux-gnueabihf/bin:$PATH \
./tensorflow/lite/tools/make/build_rpi_lib.sh TARGET_ARCH=armv6

The above statement does not work: the installer references '-DTFLITE_WITHOUT_XNNPACK -march=armv7-a' and paths like '/home/pi/tensorflow_src/tensorflow/lite/tools/make/gen/rpi_armv7l', which indicate it targets ARMv7, whereas the Pi Zero is ARMv6.

Method 2 uses CMake with ARMv6 toolchain flags:
ARMCC_PREFIX=${HOME}/toolchains/arm-rpi-linux-gnueabihf/x64-gcc-6.5.0/arm-rpi-linux-gnueabihf/bin/arm-rpi-linux-gnueabihf-
ARMCC_FLAGS="-march=armv6 -mfpu=vfp -funsafe-math-optimizations"
(The following does not work. Requires CMake 3.16)
cmake -DCMAKE_C_COMPILER=${ARMCC_PREFIX}gcc \
-DCMAKE_CXX_COMPILER=${ARMCC_PREFIX}g++ \
-DCMAKE_C_FLAGS="${ARMCC_FLAGS}" \
-DCMAKE_CXX_FLAGS="${ARMCC_FLAGS}" \
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON \
-DCMAKE_SYSTEM_NAME=Linux \
-DCMAKE_SYSTEM_PROCESSOR=armv6 \
-DTFLITE_ENABLE_XNNPACK=OFF \
../tensorflow/lite/

Method 2 will not work because the code requires CMake version 3.16 whereas the 'proper' CMake version for RPi OS Buster is 3.13.

Installing CMake v3.16.1 - DOES NOT WORK

Raspberry Pi OS Buster comes with 3.13, and apt-get upgrade will not update to 3.16. To upgrade, you must build CMake from source:

  1. CMake requires OpenSSL and cannot find it unless you run the following:
    sudo apt-get install libssl-dev

  2. Run the following commands to install CMake:

    • Step 1
      version=3.16
      build=1
      mkdir ~/temp
      cd ~/temp
      wget https://cmake.org/files/v$version/cmake-$version.$build.tar.gz
      tar -xzvf cmake-$version.$build.tar.gz
      cd cmake-$version.$build/

    • Step 2
      ./bootstrap
      make -j$(nproc)
      sudo make install
      cmake --version


Running Inference on TF Lite

This reference is a good starting point:
https://www.tensorflow.org/lite/guide/inference

  1. Train model on TF (colab,etc)
  2. Convert model to .tflite model
  3. load model on Pi Zero
  4. Transform data
  5. Run inference
  6. Interpret output
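Steps 3-6 above might look roughly like this with the tflite_runtime Interpreter. The model path, feature shape, and label list are placeholders for whatever the trained model actually uses; the tflite import sits inside the function so the step-6 helper can be used on its own:

```python
def interpret_output(scores, labels):
    """Step 6: map the model's score vector to (best_label, best_score)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]

def run_inference(model_path, features):
    """Steps 3-5: load a .tflite model, feed it features, return raw scores."""
    import numpy as np
    from tflite_runtime.interpreter import Interpreter  # pip package: tflite-runtime
    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    # Step 4: transform the features to the dtype/shape the model expects.
    data = np.asarray(features, dtype=inp["dtype"]).reshape(inp["shape"])
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()  # Step 5
    return interpreter.get_tensor(out["index"])[0]
```

Something like interpret_output(run_inference("wake.tflite", mfcc_features), labels) would then give the predicted label, where "wake.tflite" and the label list are whatever your conversion step produced.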

The file 'label_image.py' in this repo is an example of how to perform steps 3-6 above. It requires that metadata be associated with the model, or that the model data be created prior to transforming the data as shown in the example.