I've encountered an issue with Rhasspy 3 when using multi-channel audio input (e.g., arecord -c 4). With more than one channel, wake word detection becomes unresponsive and the application appears to hang during audio processing.
Steps to Reproduce
Configure Rhasspy 3 to use an audio input source with multiple channels (e.g., using arecord with -c 4 for four channels).
Attempt to trigger the wake word.
Expected Behavior
Rhasspy should process multi-channel audio inputs correctly, either by internally converting them to a single channel for processing or by handling multi-channel data without issues.
Actual Behavior
When multi-channel audio is used:
The wake word detection becomes unresponsive.
The application hangs or is significantly delayed during audio processing.
Questions and Concerns
Is this behavior expected due to current limitations in Rhasspy 3's handling of multi-channel audio?
If it's an unintended issue, what might be the best approach to resolve it? For instance, should there be an internal mechanism to convert multi-channel input to mono before processing, or should Rhasspy be enhanced to handle multi-channel audio natively?
Any guidance or suggestions on how to configure Rhasspy 3 for multi-channel audio inputs would be greatly appreciated.
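To illustrate what I mean by converting to mono before processing, here is a rough sketch of a downmix step. It assumes the input is raw interleaved signed 16-bit little-endian PCM (what arecord -f S16_LE -t raw would produce) and that each chunk contains only whole frames. The function name downmix_to_mono is hypothetical and not part of Rhasspy; I don't know how or where Rhasspy 3 would actually apply such a step.

```python
import array
import sys


def downmix_to_mono(chunk: bytes, channels: int) -> bytes:
    """Average interleaved 16-bit PCM frames into a single channel.

    Assumption: `chunk` is signed 16-bit little-endian PCM and its length
    is a multiple of 2 * channels (whole frames only).
    """
    samples = array.array("h")
    samples.frombytes(chunk)
    if sys.byteorder == "big":
        # The array uses native byte order; the input is little-endian.
        samples.byteswap()

    frame_count = len(samples) // channels
    mono = array.array(
        "h",
        (
            # Average the samples of one frame (one sample per channel).
            sum(samples[i * channels:(i + 1) * channels]) // channels
            for i in range(frame_count)
        ),
    )

    if sys.byteorder == "big":
        mono.byteswap()
    return mono.tobytes()
```

With four channels, every 8 bytes of input would become 2 bytes of output, so the wake word stage would receive the single-channel stream it presumably expects.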
Additional Context
The issue seems to revolve around how Rhasspy 3 interacts with multi-channel audio inputs and its impact on subsequent processing stages, particularly wake word detection.
Any insights or assistance on this matter would be highly valuable.
Thank you for your attention to this issue.