alphacep / vosk-server

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Apache License 2.0

"Incorrect Masking" Error in WebSocket Communication with Vosk Server #243

Open alvarogiron opened 10 months ago

alvarogiron commented 10 months ago

I’m reaching out regarding a recurring issue with the Vosk server when interfacing with FreeSWITCH’s mod_vosk. We have a setup where FreeSWITCH with mod_vosk is on one EC2 instance, and the Vosk server on another, both using Debian. Despite following your setup instructions, we're stuck with an "incorrect masking" error in WebSocket communications.

Here's the situation:

Real-time transcription works fine most of the time, but we regularly run into an "incorrect masking" error during WebSocket communication. What's puzzling is that it doesn't always happen; sometimes the WebSocket connection stays up without any issues.

Error:

ERROR:websockets.server:Error in connection handler
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/usr/lib/python3/dist-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/usr/lib/python3/dist-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/usr/lib/python3/dist-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/usr/lib/python3/dist-packages/websockets/framing.py", line 116, in read
    raise ProtocolError("incorrect masking")
websockets.exceptions.ProtocolError: incorrect masking

Steps we've taken:

Compiled and installed a custom libks library from Alphacep’s vosk-fix branch, aimed at addressing WebSocket issues. We made sure mod_vosk is linked to this custom libks, which we verified using the ldd command.

Our Vosk server is set up using the standard Docker command: docker run -d -p 2700:2700 alphacep/kaldi-en:latest

Despite these measures, the "incorrect masking" error keeps cropping up. It's hit or miss – sometimes it occurs, other times everything works smoothly. Any advice or suggestions you can provide to get past this would be appreciated. Let me know if you need logs or more details.
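
For reference, we can also hit the server directly with a small Python client to rule out the libks side. Below is a rough sketch along the lines of the test scripts in this repository; the wav path and chunk size are placeholders for our audio (mono 16-bit PCM):

    #!/usr/bin/env python3
    # Minimal sketch of a direct test client for the vosk-server websocket,
    # loosely based on the test scripts in this repository.
    # "test.wav" is a placeholder and should be a mono 16-bit PCM wav file.
    import asyncio
    import json
    import wave

    import websockets

    async def run(uri, wav_path):
        async with websockets.connect(uri) as ws:
            wf = wave.open(wav_path, "rb")
            await ws.send(json.dumps({"config": {"sample_rate": wf.getframerate()}}))
            while True:
                data = wf.readframes(3200)
                if len(data) == 0:
                    break
                await ws.send(data)
                print(await ws.recv())   # partial results
            await ws.send(json.dumps({"eof": 1}))
            print(await ws.recv())       # final result

    asyncio.run(run("ws://localhost:2700", "test.wav"))

If the error never shows up with a plain client like this, that would point back at the libks/mod_vosk side of the WebSocket framing.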

nshmyrev commented 10 months ago

It's likely you have another module that links to the default libks and sometimes loads first. You can add a debug print statement to the alphacep libks library to check that it is actually the one being used.
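
Besides ldd, you can also check which libks the running freeswitch process has actually mapped at runtime. A minimal sketch, assuming Linux /proc, pidof, and a single process named freeswitch:

    # Sketch: print which libks shared objects the running freeswitch process
    # has mapped (ldd only shows what the linker would resolve, not what loaded).
    # Assumes Linux /proc, pidof, and a single process named "freeswitch".
    import subprocess

    pid = subprocess.check_output(["pidof", "freeswitch"]).split()[0].decode()
    with open(f"/proc/{pid}/maps") as maps:
        paths = {line.split()[-1] for line in maps if "libks" in line}
    print("\n".join(sorted(paths)))

If the path printed is not your vosk-fix build, something else in the process pulled in the system libks first.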

alvarogiron commented 10 months ago

Thank you for your quick reply and the guidance provided. Despite following your suggestions, I'm still experiencing the "incorrect masking" issue in the Vosk server, which occurs after about 30 seconds of real-time transcription using the mod_vosk module in FreeSWITCH.

Here's an overview of the steps I've taken:

  1. Updated libks Library: Cloned from the vosk-fix branch at https://github.com/alphacep/libks (using git clone --branch vosk-fix --single-branch https://github.com/alphacep/libks). Added a logging function in the ks_pool.c file, within the ks_pool_open function, to log activities to a file.

  2. Recompiled libks: Confirmed the logging functionality is working, as evidenced by entries in the log file.

  3. Recompiled FreeSWITCH: Using the version cloned from https://github.com/alphacep/freeswitch.git. The compilation steps were:

    cd /home/admin/freeswitch
    make clean
    ./configure --with-lua=/usr/include/lua5.1
    make
    make install

  4. Verified libks Linkage: ldd on the mod_vosk module confirms the correct linkage to the updated libks library.

Despite these steps, the issue persists. I'm using Debian 10 Buster for this setup. Interestingly, no new entries appear in the libks log file when the "incorrect masking" error occurs, which suggests that the mod_vosk module might not actually be using the libks vosk-fix library at that moment.

Could you provide any further advice or specific build instructions to ensure the mod_vosk module in FreeSWITCH consistently utilizes the libks vosk-fix library? It's crucial to verify that the correct version of the library is being used to effectively address this issue.

Thank you once again for your assistance; I look forward to any additional guidance you can offer.

nshmyrev commented 10 months ago

You can give me shell access to your server and I will take a look.

alvarogiron commented 10 months ago

Thank you for offering to take a look at the server issue. I agree that direct access might be the most efficient way to diagnose the problem. For security reasons, we need to share the access credentials privately. Could you please provide a secure method of contact or suggest a preferred way for me to send you these details securely? Thank you for your understanding and assistance.

Al


alvarogiron commented 10 months ago

I've made some progress on the issue. After compiling FreeSWITCH in debug mode with the following configuration command:

./configure --with-lua=/usr/include/lua5.1 --enable-core-pgsql-support CFLAGS="-g -O0"

I can confirm that I now see the log entries from the custom logging functionality I added in libks. This suggests that the debug build of FreeSWITCH is indeed using the modified libks library. However, the "Incorrect Masking" error within the WebSocket communication with the Vosk server still occurs.

nshmyrev commented 10 months ago

Could you please provide a secure method of contact or suggest a preferred way for me to send you these details securely

nshmyrev@gmail.com

alvarogiron commented 10 months ago

Thank you for checking the setup. Following your observation that the "incorrect masking" message may be related to how libks finalizes connections on close, I did some further testing. I discovered an issue with the NAT configuration in vars.xml that was causing a SIP ACK timeout. After correcting the NAT setup, the system passed the 30-second mark without the "incorrect masking" error.

However, I've encountered a new error in FreeSWITCH that consistently appears after receiving transcription results from the Vosk server:

[ERR] switch_ivr_async.c:4761 Event queue failed!

This error is logged immediately following the transcription data responses and begins to occur around the 1.5 to 2-minute mark of ongoing transcription activity. Despite the error message, the transcription service continues to function and return partial results. The regularity of the error, following every data exchange with Vosk, suggests an issue with event handling in FreeSWITCH's IVR system. Here's a snippet of the log with the error marked:

2024-01-18 11:56:45.618584 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:45.638579 [DEBUG] mod_vosk.c:164 Recieved 20 bytes:
{
  "partial" : ""
}
2024-01-18 11:56:45.818635 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:45.898584 [DEBUG] mod_vosk.c:164 Recieved 20 bytes:
{
  "partial" : ""
}
2024-01-18 11:56:46.018584 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:46.078588 [DEBUG] mod_vosk.c:164 Recieved 22 bytes:
{
  "partial" : "to"
}
2024-01-18 11:56:46.078588 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:46.218593 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:46.298581 [DEBUG] mod_vosk.c:164 Recieved 23 bytes:
{
  "partial" : "two"
}
2024-01-18 11:56:46.298581 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:46.418589 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:46.478583 [DEBUG] mod_vosk.c:164 Recieved 31 bytes:
{
  "partial" : "two minutes"
}
2024-01-18 11:56:46.478583 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:46.618587 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:46.678584 [DEBUG] mod_vosk.c:164 Recieved 35 bytes:
{
  "partial" : "two minutes and"
}
2024-01-18 11:56:46.678584 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:46.818582 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:46.838586 [DEBUG] mod_vosk.c:164 Recieved 35 bytes:
{
  "partial" : "two minutes and"
}
2024-01-18 11:56:46.838586 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:47.018589 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:47.078584 [DEBUG] mod_vosk.c:164 Recieved 40 bytes:
{
  "partial" : "two minutes and five"
}
2024-01-18 11:56:47.078584 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:47.218592 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:47.278584 [DEBUG] mod_vosk.c:164 Recieved 40 bytes:
{
  "partial" : "two minutes and five"
}
2024-01-18 11:56:47.278584 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:47.418584 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:47.478590 [DEBUG] mod_vosk.c:164 Recieved 40 bytes:
{
  "partial" : "two minutes and five"
}
2024-01-18 11:56:47.478590 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴
2024-01-18 11:56:47.618585 [DEBUG] mod_vosk.c:140 Sending data 3200
2024-01-18 11:56:47.678581 [DEBUG] mod_vosk.c:164 Recieved 48 bytes:
{
  "partial" : "two minutes and five seconds"
}
2024-01-18 11:56:47.678581 [ERR] switch_ivr_async.c:4761 Event queue failed! 🔴

Since the error doesn't interrupt the transcription, it may not be critical; however, the fact that it persists until the call is dropped suggests a potential underlying issue in the event queue mechanism. I've also checked the vosk-server side and haven't noticed any corresponding errors there.

Could you advise whether this error can be safely ignored? I'm concerned it might lead to more problematic behavior when multiple transcriptions are running simultaneously.

Thank you once again for your assistance.

nshmyrev commented 10 months ago

"Event queue failed" means the event queue is full; there is a limit of 5000 events there. You need something to pull events from the queue and then it will be fine.
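
In this case the events being queued are the DETECTED_SPEECH events for the session, so whatever drives the call (dialplan script, ESL application, etc.) has to keep consuming them. Purely as an illustration, here is a rough Python sketch of one consumer over the event socket; it assumes the SWIG ESL module that ships with FreeSWITCH, the default event socket credentials, and that the channel also fires ASR events to the event system (I believe the fire-asr-events channel variable controls this):

    # Sketch: consume DETECTED_SPEECH events over the FreeSWITCH event socket.
    # Assumes the SWIG ESL Python module shipped with FreeSWITCH, the default
    # event socket credentials, and that ASR events are fired to the event
    # system (e.g. fire-asr-events=true on the channel).
    import ESL

    con = ESL.ESLconnection("127.0.0.1", "8021", "ClueCon")
    if not con.connected():
        raise SystemExit("could not connect to the FreeSWITCH event socket")

    con.events("plain", "DETECTED_SPEECH")

    while True:
        e = con.recvEvent()
        if e is None:
            continue
        # The recognition result from the ASR module arrives in the event body.
        print(e.getHeader("Speech-Type"), e.getBody())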