ros-ai / ros2_whisper

Whisper C++ Inference Action Server for ROS 2

Add support for continuous listening #12

Open RoboEvangelist opened 5 months ago

RoboEvangelist commented 5 months ago

Instead of requiring a key press, listen continuously until the wake word is spoken (e.g., "Hey Ross").

mhubii commented 5 months ago

Great suggestion! Happy to review a PR, @RoboEvangelist.

Currently, you can use the max_duration field in the action message to specify a duration:

https://github.com/ros-ai/ros2_whisper/blob/e5120805978478a8a33ad8bacf776b17eabb67fc/whisper_msgs/action/Inference.action#L5

Any idea how this could be handled better?

There are models for voice activity detection: https://github.com/snakers4/silero-vad

Once activity is detected, the action server could be called.
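
To make the idea concrete, here is a rough sketch of such a gate. It is not silero-vad and not part of ros2_whisper: it assumes a simple RMS energy threshold as a stand-in for a real VAD model, and `ActivityGate`, `on_activity`, `threshold`, and `min_active_frames` are hypothetical names chosen for illustration. The `on_activity` callback marks where a node could send a goal to the inference action server. Note that this only detects activity, not a specific wake word.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame (samples in [-1, 1])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


@dataclass
class ActivityGate:
    """Hypothetical energy-based gate; a real setup would swap in a VAD model."""

    on_activity: Callable[[], None]  # placeholder for "call the action server"
    threshold: float = 0.05          # assumed RMS level that counts as activity
    min_active_frames: int = 3       # consecutive loud frames needed to trigger
    _active_run: int = 0
    _triggered: bool = False

    def push(self, frame: Sequence[float]) -> None:
        """Feed one frame; fire on_activity once per burst of activity."""
        if rms(frame) >= self.threshold:
            self._active_run += 1
            if self._active_run >= self.min_active_frames and not self._triggered:
                self._triggered = True
                self.on_activity()
        else:
            # A quiet frame ends the burst and re-arms the gate.
            self._active_run = 0
            self._triggered = False
```

Requiring several consecutive active frames avoids triggering on short clicks, and re-arming only after a quiet frame keeps one utterance from sending multiple goals.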