Closed yodakohl closed 6 years ago
Thanks, @yodakohl. Appreciate your contribution. We should run the benchmark again with this config. Thankfully, we are about to start a new run and can incorporate this into it. Let's merge when the result is out, as I want the result and the code to be in sync.
Quick question: did you have a chance to run Snowboy with ApplyFrontend(True) on an RPi Zero or RPi 3? My understanding is that this flag enables a few audio preprocessing routines before wake-word detection, and I am wondering what the effect is on runtime metrics such as CPU and memory consumption. Thanks in advance.
I did a quick test, but it's a bit difficult to measure since Snowboy does voice activity detection. I only tested on the Pi Zero W since I currently have no Pi 3 running.
Frontend False Idle
%Cpu(s): 5.0 us, 2.0 sy, 0.0 ni, 92.6 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 378936 total, 38228 free, 39856 used, 300852 buff/cache
KiB Swap: 102396 total, 102140 free, 256 used. 277904 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8061 pi 20 0 35992 12984 7640 S 4.9 3.4 0:06.05 python
8063 pi 20 0 5608 2500 2232 S 2.6 0.7 0:00.98 arecord
Frontend False Voice
%Cpu(s): 44.2 us, 1.3 sy, 0.0 ni, 54.2 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 378936 total, 38724 free, 39360 used, 300852 buff/cache
KiB Swap: 102396 total, 102140 free, 256 used. 278400 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8061 pi 20 0 35992 13036 7640 S 41.9 3.4 0:09.30 python
8063 pi 20 0 5608 2500 2232 S 1.9 0.7 0:02.15 arecord
Frontend True Idle
%Cpu(s): 14.6 us, 1.0 sy, 0.0 ni, 84.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 378936 total, 38724 free, 39360 used, 300852 buff/cache
KiB Swap: 102396 total, 102140 free, 256 used. 278400 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8121 pi 20 0 35852 12732 7504 S 12.1 3.4 0:02.70 python
8123 pi 20 0 5608 2552 2284 S 2.6 0.7 0:00.43 arecord
Frontend True Voice
%Cpu(s): 70.4 us, 2.0 sy, 0.0 ni, 27.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 378936 total, 38304 free, 39780 used, 300852 buff/cache
KiB Swap: 102396 total, 102140 free, 256 used. 277980 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8131 pi 20 0 35824 13064 7840 S 66.8 3.4 0:08.63 python
8134 pi 20 0 5608 2564 2296 S 2.0 0.7 0:00.52 arecord
It seems like memory is unchanged, but CPU usage of the python process jumped from about 5% to 12% at idle and from about 42% to 67% during voice activity. I used the arecord example in Python:
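For reference, the relative overhead can be worked out from the python process %CPU figures in the top output above. This is only a rough sketch: the dictionary layout and the function name `frontend_overhead` are made up for illustration, and the numbers are single snapshots rather than averages.

```python
# %CPU of the python process, copied from the top output above.
# Keys are (frontend setting, activity); the layout is illustrative only.
cpu_percent = {
    ("off", "idle"): 4.9,
    ("off", "voice"): 41.9,
    ("on", "idle"): 12.1,
    ("on", "voice"): 66.8,
}


def frontend_overhead(state):
    """Return (increase in percentage points, ratio on/off) for a state."""
    off = cpu_percent[("off", state)]
    on = cpu_percent[("on", state)]
    return on - off, on / off


for state in ("idle", "voice"):
    delta, ratio = frontend_overhead(state)
    print(f"{state}: +{delta:.1f} points, {ratio:.2f}x")
```

So with the frontend enabled, the python process uses roughly 2.5x the CPU at idle and roughly 1.6x during voice activity on the Pi Zero W.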
python demo_arecord.py resources/alexa/alexa-avs-sample-app/alexa.umdl
The change I made:
+++ b/examples/Python/snowboydecoder_arecord.py
@@ -75,6 +75,7 @@ class HotwordDetector(object):
             resource_filename=resource.encode(), model_str=model_str.encode())
         self.detector.SetAudioGain(audio_gain)
         self.num_hotwords = self.detector.NumHotwords()
+        self.detector.ApplyFrontend(True)
Thanks a lot. This is really helpful and consistent with what we are measuring. I am going to add a script that measures this systematically and reproducibly. I will probably use real-time factor instead of CPU usage. Thoughts?
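As a rough idea of what a real-time factor measurement could look like: real-time factor is processing time divided by the duration of the audio processed, so values below 1.0 mean faster than real time. The sketch below is not the benchmark script itself. `real_time_factor` and `process_frame` are hypothetical names, and the lambda is a stand-in workload; in a real run `process_frame` would wrap the engine's detection call.

```python
import time


def real_time_factor(process_frame, frames, frame_seconds):
    """Processing time divided by audio duration for a list of frames."""
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return elapsed / (len(frames) * frame_seconds)


# Stand-in input: 100 silent frames of 10 ms each at 16 kHz, 16-bit mono
# (160 samples * 2 bytes = 320 bytes per frame).
frames = [bytes(320)] * 100

# Stand-in workload; replace with a call into the wake-word engine.
rtf = real_time_factor(lambda frame: sum(frame), frames, 0.01)
print(f"RTF = {rtf:.4f}")
```

Because it is measured over a fixed set of audio files rather than live microphone input, this kind of number should be less sensitive to voice activity detection than a top snapshot.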
After looking into Snowboy's documentation, I decided that in order to have an apples-to-apples comparison we shouldn't set this flag, for two reasons: (1) the flag essentially enables a few audio preprocessing algorithms (e.g. automatic gain control and noise suppression) that are considered separate modules in the audio processing chain and not part of the wake-word engine's functionality; (2) the flag is only recommended for (some) universal models and should not be set for personal models. Thank you for looking into it.
According to the Snowboy README, ApplyFrontend should be set to True if resources/alexa/alexa-avs-sample-app/alexa.umdl is used.