jeff1evesque / LeQue

Activate installed microphone, and analyze sound input

Ensure audio recording is 16kHz, 16bit, mono #31

Closed jeff1evesque closed 10 years ago

jeff1evesque commented 10 years ago

We must change the default recordings from 44,100 Hz to 16,000 Hz since this is the specification from CMUSphinx.

Speech recordings (wav files) Recording files must be in MS WAV format with specific sample rate - 16 kHz, 16 bit, mono for desktop application, 8kHz, 16bit, mono for telephone applications.

jeff1evesque commented 10 years ago

To enforce this, we will watch the audio-analyzer/audio/ directory with inotify. If any audio files are created or modified, we will convert them to 16 kHz, 16 bit, mono:

ffmpeg -i audio.wav -acodec pcm_s16le -ac 1 -ar 16000 out.wav

We scan the audio recording directory with inotify because we find it more reliable than PHP (i.e. exec()) for multiple reasons. Specifically:

Inotify (inode notify) is a Linux kernel subsystem that acts to extend filesystems to notice changes to the filesystem, and report those changes to applications.

This means that when apache2 hiccups, for any number of reasons, our file will still be converted by our script. A second case is worth considering: if PHP hiccups, does our audio file get created? If it does not, then inotify never fires, so the bash conversion script is not run on a nonexistent wav file.
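A minimal sketch of such a watcher, assuming `inotifywait` (from inotify-tools) and `ffmpeg` are installed. The function names, the `WATCH_DIR` and `OUT_DIR` variables, the `--watch` argument, and the choice to write into a separate output directory (so a converted file does not retrigger the watch) are assumptions for illustration, not the project's actual script:

```shell
#!/usr/bin/env bash
# Hypothetical locations; adjust to the real project layout.
WATCH_DIR="${WATCH_DIR:-audio-analyzer/audio}"
OUT_DIR="${OUT_DIR:-audio-analyzer/audio-converted}"

# Print the output path for an input file: same base name, .wav extension.
output_path() {
  base="${1##*/}"                                # drop leading directories
  printf '%s/%s.wav\n' "$OUT_DIR" "${base%.*}"   # drop the last extension
}

# Convert one file to 16 kHz, 16 bit, mono PCM.
# -nostdin keeps ffmpeg from consuming the watcher's input stream.
convert_file() {
  ffmpeg -nostdin -y -i "$1" -acodec pcm_s16le -ac 1 -ar 16000 "$(output_path "$1")"
}

# Emit the full path of every created or modified file, and convert each one.
watch_and_convert() {
  mkdir -p "$OUT_DIR"
  inotifywait -m -q -e create -e modify --format '%w%f' "$WATCH_DIR" |
    while IFS= read -r file; do
      convert_file "$file"
    done
}

# Start watching only when explicitly asked (assumed convention).
if [ "${1:-}" = "--watch" ]; then
  watch_and_convert
fi
```

Because `create` and `modify` can fire while a file is still being written, listening for `close_write` instead may be a safer choice; that is a design option, not something this thread settled on.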

jeff1evesque commented 10 years ago

0e886e5: We detect new files in the /audio-analyzer/bash/audio directory with a bash script.

If a new file is created, chances are it was a wav file from our application. However, we aren't guaranteed that it has the required settings of 16 kHz, 16 bit, mono (specified above). So, we force it with the following line:

ffmpeg -i "$file" -acodec pcm_s16le -ac 1 -ar 16000 "$filename".wav

Note: prior to the above line, we had the following:

filename="${file##*/}"

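which is used to get the filename: it strips the leading directory path from $file and leaves the base name with its extension intact (removing the extension would instead take ${file%.*}).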
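As a quick illustration of the parameter expansions involved, the sketch below contrasts `${file##*/}` (drop leading directories) with `${file%.*}` (drop the last extension). The sample path and the variable names `base`, `stem`, and `noext` are made up for the example:

```shell
file="/tmp/recordings/sample.wav"   # hypothetical path

base="${file##*/}"    # longest prefix matching '*/' removed: sample.wav
stem="${base%.*}"     # shortest suffix matching '.*' removed: sample
noext="${file%.*}"    # directory kept, extension dropped: /tmp/recordings/sample

printf '%s\n' "$base" "$stem" "$noext"
```

Note that appending `.wav` to `$base` would yield `sample.wav.wav`, whereas appending it to `$noext` reproduces the input path.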

jeff1evesque commented 10 years ago

f7a2271: we add the flag -e modify in the file audio_converter_rate to detect when existing files are modified. If one is, we convert it as we did above:

ffmpeg -i "$file" -acodec pcm_s16le -ac 1 -ar 16000 "$filename".wav

jeff1evesque commented 10 years ago

31ec603: we could attempt to recursively scan everything within the audio-analyzer/audio directory, but we decided to scan only our recordings. Thus, we changed the directory path to scan from ../audio to ../audio/recording in the bash file audio-analyzer/bash/audio_converter_rate.

jeff1evesque commented 10 years ago

a928768: we made sure we could execute the audio_converter_rate bash file with chmod u+x audio_converter_rate

jeff1evesque commented 10 years ago

The two commits, 9256cbb and 8deb761, belong in https://github.com/jeff1evesque/audio-analyzer/issues/34#issuecomment-45053659.

jeff1evesque commented 10 years ago

We are not properly converting the audio, as indicated in https://github.com/jeff1evesque/audio-analyzer/issues/232#issuecomment-47458289. Therefore, we need to investigate how to modify the conversion process of converter_wav_rate.

jeff1evesque commented 10 years ago

65333c0, d30f3ef: we ensure audio files are 16 bit, 16 kHz, mono.
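
One way to confirm that a file meets the 16 bit, 16 kHz, mono requirement is to read the fields of its WAV header directly. The sketch below assumes the canonical 44-byte PCM header (the `fmt ` chunk immediately after `RIFF....WAVE`); files with extra chunks before `fmt ` would need a real parser. The helper names `read_le` and `is_sphinx_wav` are hypothetical and not part of the project:

```shell
#!/usr/bin/env bash
# Print the unsigned little-endian integer stored in $3 bytes at offset $2 of file $1.
read_le() {
  value=0
  mult=1
  # od emits each byte as a decimal number; the first byte is least significant.
  for b in $(od -An -v -tu1 -j "$2" -N "$3" "$1"); do
    value=$(( $value + $b * $mult ))
    mult=$(( $mult * 256 ))
  done
  printf '%d\n' "$value"
}

# Succeed only if the header describes 16 kHz, 16 bit, mono PCM.
# Offsets assume the canonical 44-byte WAV header layout.
is_sphinx_wav() {
  [ "$(head -c 4 "$1")" = "RIFF" ]        || return 1
  [ "$(read_le "$1" 20 2)" -eq 1 ]        || return 1   # audio format 1 = PCM
  [ "$(read_le "$1" 22 2)" -eq 1 ]        || return 1   # channel count
  [ "$(read_le "$1" 24 4)" -eq 16000 ]    || return 1   # sample rate in Hz
  [ "$(read_le "$1" 34 2)" -eq 16 ]                     # bits per sample
}
```

A call such as `is_sphinx_wav out.wav && echo ok` could then be run after each conversion to catch files that were not actually resampled.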