jitsi / jigasi

Jigasi: a server-side application acting as a gateway to Jitsi Meet conferences. Currently allows regular SIP clients to join meetings and provides transcription capabilities.
Apache License 2.0

fix: streaming session handling should account for dynamically changed transcription language #534

Closed by loli10K 2 months ago

loli10K commented 2 months ago

We have been working on the ability to change the transcription language dynamically from the web UI: a participant modifying their local settings (Three dots -> Settings -> General -> Language) would be able to start speaking in another language and still get an accurate transcription.

We are testing this functionality with local Vosk instances and it works well. However, we had to make a very small modification to the code: without this one-line change, Jigasi keeps creating new WebSockets and eventually exhausts all resources. This is because sendRequest() never keeps track of the newly created TranscriptionService.StreamingRecognitionSession instances.

While we understand that changing the transcription language dynamically is not (yet) natively supported by Jigasi, this small change is probably needed anyway to keep track of all the streaming sessions.
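
To illustrate the idea for readers of this thread, here is a minimal, self-contained sketch of session tracking. It is not the actual Jigasi code or the patch itself: StreamingSession, FakeService, Participant and the language argument to sendRequest() are hypothetical stand-ins, and a real session would own a WebSocket. The sketch only assumes that a session can report whether it has ended. The point is that the newly created session is stored in a field, so later calls reuse it instead of opening another connection.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Jigasi's classes. */
class SessionTrackingSketch
{
    /** Stand-in for a streaming recognition session (a real one would own a WebSocket). */
    static class StreamingSession
    {
        final String language;
        private boolean ended = false;

        StreamingSession(String language) { this.language = language; }

        void send(byte[] audio) { /* a real session would stream the audio here */ }
        void end() { ended = true; }
        boolean ended() { return ended; }
    }

    /** Stand-in for the transcription service; records every session it opens. */
    static class FakeService
    {
        final List<StreamingSession> opened = new ArrayList<>();

        StreamingSession initStreamingSession(String language)
        {
            StreamingSession s = new StreamingSession(language);
            opened.add(s);
            return s;
        }
    }

    /** Stand-in for the per-participant sender of audio requests. */
    static class Participant
    {
        private final FakeService service;
        private StreamingSession session; // the tracked session, reused across calls

        Participant(FakeService service) { this.service = service; }

        void sendRequest(byte[] audio, String language)
        {
            boolean reusable = session != null
                && !session.ended()
                && session.language.equals(language);
            if (!reusable)
            {
                // Close a still-open session for the previous language.
                if (session != null && !session.ended())
                {
                    session.end();
                }
                // Key step: remember the new session so the next call reuses it.
                session = service.initStreamingSession(language);
            }
            session.send(audio);
        }

        StreamingSession currentSession() { return session; }
    }

    public static void main(String[] args)
    {
        FakeService service = new FakeService();
        Participant participant = new Participant(service);
        byte[] audio = new byte[320];
        for (int i = 0; i < 3; i++)
        {
            participant.sendRequest(audio, "en");
        }
        participant.sendRequest(audio, "it"); // language changed mid-conference
        System.out.println("sessions opened: " + service.opened.size());
    }
}
```

If the assignment to the session field were dropped, the reuse check would fail on every call and each audio packet would open a fresh session, which matches the connection flood described above.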

loli10K commented 2 months ago

Actually, I can confirm this is indeed needed anyway to handle "prematurely ended streaming sessions".

Without this change this is what happens if we restart the Vosk container while speaking:

docker-jitsi-meet-jigasi-1          | 2024-04-28 09:03:37.033 SEVERE: [83] VoskTranscriptionService$VoskWebsocketStreamingSession.onError#306: Error while streaming audio data to transcription service
docker-jitsi-meet-jigasi-1          | java.nio.channels.ClosedChannelException
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketSessionState.onEof(WebSocketSessionState.java:169)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.onEof(WebSocketCoreSession.java:253)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:482)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:340)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:936)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1080)
docker-jitsi-meet-jigasi-1          |   at java.base/java.lang.Thread.run(Thread.java:840)
docker-jitsi-meet-vosk-en-1 exited with code 0
docker-jitsi-meet-vosk-en-1 exited with code 0
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=10 max-active=3000 lattice-beam=2
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:249) Loading i-vector extractor from /opt/vosk-model/ivector/final.ie
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:285) Loading HCL and G from /opt/vosk-model/graph/HCLr.fst /opt/vosk-model/graph/Gr.fst
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:311) Loading winfo /opt/vosk-model/graph/phones/word_boundary.int
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39138)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39146)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39150)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39166)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39180)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39192)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39204)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39216)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39230)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39238)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39246)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39256)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39268)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39282)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39288)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39292)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 39294)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 53470)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 53486)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 53502)
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 53512)

Basically, every sendRequest() call creates a new WebSocket and thread, eventually exhausting all JVM resources.

With the fix applied:

docker-jitsi-meet-jigasi-1          | 2024-04-28 09:06:23.923 SEVERE: [76] VoskTranscriptionService$VoskWebsocketStreamingSession.onError#306: Error while streaming audio data to transcription service
docker-jitsi-meet-jigasi-1          | java.nio.channels.ClosedChannelException
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketSessionState.onEof(WebSocketSessionState.java:169)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketCoreSession.onEof(WebSocketCoreSession.java:253)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.fillAndParse(WebSocketConnection.java:482)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.websocket.core.internal.WebSocketConnection.onFillable(WebSocketConnection.java:340)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:936)
docker-jitsi-meet-jigasi-1          |   at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1080)
docker-jitsi-meet-jigasi-1          |   at java.base/java.lang.Thread.run(Thread.java:840)
docker-jitsi-meet-vosk-en-1 exited with code 0
docker-jitsi-meet-vosk-en-1 exited with code 0
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=10 max-active=3000 lattice-beam=2
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:249) Loading i-vector extractor from /opt/vosk-model/ivector/final.ie
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:285) Loading HCL and G from /opt/vosk-model/graph/HCLr.fst /opt/vosk-model/graph/Gr.fst
docker-jitsi-meet-vosk-en-1         | LOG (VoskAPI:ReadDataFiles():model.cc:311) Loading winfo /opt/vosk-model/graph/phones/word_boundary.int
docker-jitsi-meet-vosk-en-1         | INFO:root:Connection from ('172.30.0.7', 51722)
docker-jitsi-meet-vosk-en-1         | INFO:root:Config {'sample_rate': 48000.0}

damencho commented 2 months ago

Hi, thanks for your contribution! If you haven't already done so, could you please make sure you sign our CLA (https://jitsi.org/icla for individuals and https://jitsi.org/ccla for corporations)? We would, unfortunately, be unable to merge your patch unless we have that piece :(.

loli10K commented 2 months ago

> could you please make sure you sign our CLA

Done, thank you.

damencho commented 2 months ago

Thank you for your contribution.