SOZ-Faut-etre-Sub / ZUMBLE

Rust implementation of a Mumble server for FiveM
MIT License

UDP packets can *not* be received #2

Open · rdlh opened this issue 1 year ago

rdlh commented 1 year ago

Hi there!

First, thanks for the amazing job you guys have done.

We're having an issue using ZUMBLE on a French server with 250+ simultaneous players. Sometimes (that's the issue), players get stuck in the default Mumble channel: they can speak, but can't hear others. After a random number of logins/logouts, it starts working again.

The only thing we've found after some investigation is this log line for EVERY stuck player:

UDP packets can *not* be received. Switching to TCP tunnel mode

And seconds later:

UDP packets can be received. Switching to UDP mode.

We're using the latest pma-voice version and FiveM build 6231.

Do you have any idea about how to fix this?

Cc @ChaosQK

joelwurtz commented 1 year ago

Are there no logs from the zumble process showing why those players can't connect over UDP? Are you also using the latest version?

FYI, a client-to-server connection works like this:
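The client first opens a TLS control connection over TCP and authenticates; both sides then probe UDP with encrypted pings, and voice only flows over UDP once those pings make it through — otherwise it stays tunnelled over TCP. A simplified sketch of that fallback, matching the client log lines quoted above (illustrative only, not Zumble's actual code):

```rust
// Illustrative sketch of the standard Mumble transport fallback.
enum VoiceTransport {
    Udp,
    TcpTunnel,
}

fn pick_transport(udp_ping_echoed: bool) -> VoiceTransport {
    if udp_ping_echoed {
        // "UDP packets can be received. Switching to UDP mode."
        VoiceTransport::Udp
    } else {
        // "UDP packets can *not* be received. Switching to TCP tunnel mode"
        VoiceTransport::TcpTunnel
    }
}
```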

rdlh commented 1 year ago

Thanks for your fast answer !

I can send you some anonymised logs if needed to help you find what's going on; feel free to ping me on Discord: Knid#4242. We pulled the ZUMBLE repo from the oss branch yesterday, before our production release, so I think we're using the latest version.

We tried to implement soz-voip, but it's not compatible with the phone resource we're using, and soz-phone isn't immediately an option for us as we're unfortunately using ESX.

rdlh commented 1 year ago

We're also sometimes seeing log spam saying:

2023-03-29T02:01:24.116090Z ERROR zumble::server::udp: cannot send voice packet to client: no available capacity

Is this an issue we can fix on our end?

joelwurtz commented 1 year ago

No, we are also facing this issue. It happens sometimes and I don't know why: something is blocking the client loop, and I don't see what. This results in the message queue for that client filling up, which produces this error.

I believe there is a race condition when someone disconnects (or something else), but normally those errors should not persist for long: when it's blocked, the TCP/UDP connection is "frozen", then the client reconnects with a new account and the blocked one is dropped afterwards.
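For context, "no available capacity" is exactly the message tokio's bounded mpsc channel produces when try_send is called on a full queue. A minimal reproduction sketch (not Zumble's actual code):

```rust
use tokio::sync::mpsc;

fn main() {
    // A bounded queue, like the per-client message channel in Zumble
    // (but with capacity 2 instead of 128, to force the error).
    let (tx, _rx) = mpsc::channel::<u32>(2);

    tx.try_send(1).unwrap();
    tx.try_send(2).unwrap();

    // Nothing has drained the queue, so this returns TrySendError::Full,
    // which displays as "no available capacity".
    if let Err(err) = tx.try_send(3) {
        eprintln!("cannot send voice packet to client: {}", err);
    }
}
```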

joelwurtz commented 1 year ago

But this may be related to your issue. I will try to debug it in the coming days so this error no longer happens.

galexrt commented 1 year ago

I have no experience in Rust, but reading the Tokio docs suggests it might be all about the channel buffer sizes (maybe a client just takes longer to be sent its messages, as you said, until the timeout/disconnect drops the client). I have increased the targets capacity as well, though it should be unrelated to the actual issue.

I wonder whether the changes below make sense to someone who knows Rust/Zumble, or whether they just postpone the problem to a later time (daily restarts FTW?). The patch below makes the situation a bit better for us for now.

Here's the patch diff:

```diff
diff --git a/src/client.rs b/src/client.rs
index 4c6d448..04171c0 100644
--- a/src/client.rs
+++ b/src/client.rs
@@ -71,8 +71,8 @@ impl Client {
         publisher: Sender<ClientMessage>,
     ) -> Self {
         let tokens = authenticate.get_tokens().iter().map(|token| token.to_string()).collect();
-        let mut targets = Vec::with_capacity(30);
-        targets.resize_with(30, Default::default);
+        let mut targets = Vec::with_capacity(128);
+        targets.resize_with(128, Default::default);

         Self {
             version,
diff --git a/src/handler/voice_packet.rs b/src/handler/voice_packet.rs
index a5c33ad..4d83c97 100644
--- a/src/handler/voice_packet.rs
+++ b/src/handler/voice_packet.rs
@@ -86,8 +86,9 @@ impl Handler for VoicePacket<Clientbound> {
                             Ok(_) => {}
                             Err(err) => {
                                 tracing::error!(
-                                    "error sending voice packet message to {}: {}",
+                                    "error sending voice packet message to {} (capacity: {}): {}",
                                     client_read.authenticate.get_username(),
+                                    client_read.publisher.capacity(),
                                     err
                                 );
                             }
diff --git a/src/server/tcp.rs b/src/server/tcp.rs
index aa81fc6..4e0ed51 100644
--- a/src/server/tcp.rs
+++ b/src/server/tcp.rs
@@ -60,7 +60,7 @@ async fn handle_new_client(acceptor: TlsAcceptor,
     let (version, authenticate, crypt_state) = Client::init(&mut stream, server_version).await.context("init client")?;

     let (read, write) = io::split(stream);
-    let (tx, rx) = mpsc::channel(128);
+    let (tx, rx) = mpsc::channel(1024);

     let username = authenticate.get_username().to_string();
     let client = {
diff --git a/src/server/udp.rs b/src/server/udp.rs
index 0a781f4..fdc5eb1 100644
--- a/src/server/udp.rs
+++ b/src/server/udp.rs
@@ -151,7 +151,7 @@ async fn handle_packet(mut buffer: BytesMut, size: usize, addr: SocketAddr, prot
                 (Some(client), Some(packet)) => {
                     {
                         tracing::info!(
-                            "UPD connected client {} on {}",
+                            "UDP connected client {} on {}",
                             client.read_err().await?.authenticate.get_username(),
                             addr
                         );
diff --git a/src/state.rs b/src/state.rs
index b2d7449..5bfa20f 100644
--- a/src/state.rs
+++ b/src/state.rs
@@ -176,7 +176,12 @@ impl ServerState {
                 }) {
                     Ok(_) => {}
                     Err(err) => {
-                        tracing::error!("failed to send message to {}: {}", client_read.authenticate.get_username(), err);
+                        tracing::error!(
+                            "failed to send message to {} (capacity: {}): {}",
+                            client_read.authenticate.get_username(),
+                            client_read.publisher.capacity(),
+                            err
+                        );
                     }
                 }
            }
```

If I can help with debugging this let me know.

NariieL commented 1 year ago

Small update: On SOZ with 310 players connected simultaneously, we had no VOIP problems.

NariieL commented 1 year ago

Small update²: Still no problems with 387 players connected. I hope the problem has been solved on your side. <3

Korioz commented 1 year ago

Hey, we've been using it on a server averaging ~1100 players at peak. It had only one issue: when we reached 1000 players, the sound just cut off for every new connection, as if there were a connection pool limit.

I've applied some tiny patches which helped resolve some issues; you can check them out here

Anyway, it's still a prototype, but this project is a viable replacement for other external Mumble server solutions.

TheiLLeniumStudios commented 10 months ago

@Korioz are you using ZUMBLE with pma-voice? or soz-voip?

We're having issues on servers that use pma-voice with ZUMBLE, where for 99% of players no channel is assigned or created for them; they are placed in the root channel and cannot hear each other at all.

We've also tried explicitly invoking MumbleCreateChannel before assigning a player to it, but it seems to have no effect, which we can see when joining via a desktop Mumble client.

trong0981 commented 10 months ago

Zumble uses a sync thread for all TCP and UDP. You need to change UDP to unsync; that fixes all your problems.

TheiLLeniumStudios commented 10 months ago

@trong0981 can you explain what you mean by that? Are you saying async is causing issues and there are race conditions because of it?

joelwurtz commented 10 months ago

I also don't understand what you mean by an unsync thread. If you could share a snippet of what you changed, that would be nice, so we can port it back to this library and everyone benefits.

trong0981 commented 10 months ago

I'm using Zumble and it's good, but my knowledge of Rust is not good enough to fix the errors myself. I think this trick can help you fix all the problems still present in Zumble.

There are 2 problems that, if solved, I think will completely resolve the errors people often run into:

  1. Take advantage of UnboundedReceiver & UnboundedSender to avoid having to create bounded channels for tokio's multi-threaded processing. Handle the deadlock problem when using unbounded channels if there are too many clients or the VPS (DC) is too weak.

  2. Completely delete a chat room once no one is left in it, to avoid memory overflow from too many chat rooms.

TheiLLeniumStudios commented 10 months ago

Made a bunch of changes over at my fork that have now resolved all the issues.

With all these changes, Zumble runs pretty stable and I am not seeing any issues so far.

Also, the reason for making the channels permanent is that channels do get cleaned up when no one is in them, but the same channel with the same channel ID CANNOT be recreated, which is why I just made them permanent. If anyone can help figure out why this happens, that would be awesome. Maybe @joelwurtz has some idea.

I will, however, look at the UnboundedReceiver and UnboundedSender types. I have 0% knowledge of Rust, so I'm not confident I'll be able to fix the actual root cause, which is why I just forked it rather than contributing hotfixes.

Link to my fork: https://github.com/iLLeniumStudios/ZUMBLE

joelwurtz commented 10 months ago

Unbounded only means there is no longer a limit on the message buffer (when sending/receiving messages between threads). I'm not a fan, since if there is a problem consuming messages it will grow without errors (so it can exhaust your memory), and it will also use more CPU (as it needs to allocate buffer memory on the fly, whereas a fixed buffer is only allocated at start).
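In tokio terms the difference looks roughly like this (a minimal sketch for illustration, not Zumble's actual code):

```rust
use tokio::sync::mpsc;

fn sketch() {
    // Bounded: capacity is fixed at creation; when the consumer stalls,
    // try_send fails fast with "no available capacity" instead of growing.
    let (bounded_tx, _bounded_rx) = mpsc::channel::<Vec<u8>>(1024);
    let _ = bounded_tx.try_send(vec![0u8; 64]);

    // Unbounded: send never signals backpressure, so a stalled consumer
    // silently lets the queue (and memory) grow without limit.
    let (unbounded_tx, _unbounded_rx) = mpsc::unbounded_channel::<Vec<u8>>();
    let _ = unbounded_tx.send(vec![0u8; 64]);
}
```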

But increasing the capacity, as you have done, should be okay. I presume most of those errors happen when one of the client threads is blocked by a disconnect (it awaits for some time while writing the message to the client) and a lot of voice messages stack up, until there is no space left in the buffer.

I will try to take a look at why a channel cannot be recreated. Since we don't use channels, I haven't tested them much (we only use targets, even for proximity).

TheiLLeniumStudios commented 10 months ago

Thanks for looking into it @joelwurtz. I suspect it has something to do with the channel not getting removed from the state, or some sort of race condition between removal and creation for the same channel ID. I tried to remove the channel inside the temporary check itself but ran into a bunch of mutability errors that I failed to fix. I'd assume that's why you remove it from the channels state map in the handler itself rather than right after checking that it's no longer needed.

TheiLLeniumStudios commented 10 months ago

On the plus side, it seems to be working well even with permanent channels. Here's what the CPU and memory usage looks like for the past 6 hours, during which the player count started at 450 and then went down to 100:

[image: CPU and memory usage graph]

joelwurtz commented 10 months ago

A channel does not consume a lot of memory, so even if you have, say, 10,000 channels, memory usage should stay relatively low.

TheiLLeniumStudios commented 10 months ago

Makes sense. This issue is not a blocker, by the way, but I'll try to add some debug logs around channel removal to see if I can figure out the problem (even if I can't fix it myself). Once we fix it somehow, I'll be happy to raise a PR back to this repo with my changes and the workflows.

TheiLLeniumStudios commented 10 months ago

So I tried to debug this, and it looks like the channel removal statement never gets triggered (when a player disconnects), neither in channel_state: https://github.com/SOZ-Faut-etre-Sub/ZUMBLE/blob/oss/src/handler/channel_state.rs#L84 nor in user_state: https://github.com/SOZ-Faut-etre-Sub/ZUMBLE/blob/oss/src/handler/user_state.rs#L36

I don't know the exact reason, but that is basically what's causing the channels to linger and not be recreatable.
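A hedged sketch of the kind of instrumentation that could narrow this down — remove_if_empty and the members map are hypothetical stand-ins for the removal code at the two links above:

```rust
use std::collections::HashMap;

// Hypothetical helper: `members` maps a channel id to its member count;
// the real state lives in ZUMBLE's server state, not a bare HashMap.
fn remove_if_empty(members: &mut HashMap<u32, usize>, channel_id: u32, temporary: bool) {
    let empty = members.get(&channel_id).map_or(true, |&count| count == 0);

    // Log every removal check so a never-firing branch shows up in the logs.
    tracing::debug!(channel_id, temporary, empty, "checking channel for removal");

    if temporary && empty {
        members.remove(&channel_id);
        tracing::debug!(channel_id, "temporary channel removed");
    }
}
```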

FingerlessGlov3s commented 1 month ago

@TheiLLeniumStudios

> Made a bunch of changes over at my fork that have now resolved all the issues.

Does this fix the issue where, if I toggled my voice chat off and then back on, people could hear me but I couldn't hear them? I was also seeing various issues where rejoining the server would fix voice for someone.

Just tried to deploy this repo's version of Zumble to a server of 200-300 players, and proximity chat was very hit-and-miss, though radio always worked. Using pma-voice.