mumble-voip / mumble

Mumble is an open-source, low-latency, high quality voice chat software.
https://www.mumble.info

Mumble takes forever to load the server list #6240

Open mataha opened 11 months ago

mataha commented 11 months ago

Description

Loading the server list, whether when opening Mumble or when switching servers, takes a rather long time.

Steps to reproduce

  1. Open Mumble.
  2. Click Server > Connect.
  3. That's it.

Mumble version

1.4.287

Mumble component

Client

OS

Windows

Reproducible?

Yes

Additional information

I've seen #4894, though that one seems to be macOS specific...?

Relevant log output

The only line of possible relevance in the log is:

<W>2023-10-12 21:35:32.028 Public list disabled

That said, neither enabling nor disabling the public list had any effect. WinDbg hasn't shown anything interesting either.

Screenshots

https://github.com/mumble-voip/mumble/assets/7210216/c069d17a-e23a-482c-ab28-42dba86de811

Krzmbrzl commented 11 months ago

> The only line of possible relevance in the log is:
>
> <W>2023-10-12 21:35:32.028 Public list disabled

In that case I am quite sure that the issue is not that the list takes a long time to load, but that it is in fact disabled. Checking the code, it is clear that you only see this log message if the list is disabled: https://github.com/mumble-voip/mumble/blob/c21a90a831ad12f95995bc55ce5e864a0b2b24e9/src/mumble/ConnectDialog.cpp#L115-L125

> That said, neither enabling nor disabling the public list had any effect.

I take it that you properly saved the settings after changing this one, right? When you did that, did the change persist across a restart?

Hartmnt commented 11 months ago

I think OP is talking about the time it takes to ping the favorites and display ping and user count

mataha commented 11 months ago

> I think OP is talking about the time it takes to ping the favorites and display ping and user count

Indeed.

Krzmbrzl commented 11 months ago

Ah, so this is not about the server list showing up at all but rather about the ping refreshing. Got it 🤔 What is your ping to the Mumble central server? That's probably the bottleneck here...

mataha commented 11 months ago

> Ah, so this is not about the server list showing up at all but rather about the ping refreshing. Got it 🤔 What is your ping to the Mumble central server? That's probably the bottleneck here...

mataha@shuchiin:~$ ping -i 0.1 -D -c 10 publist.mumble.info
PING dualstack.osff.map.fastly.net (151.101.38.217) 56(84) bytes of data.
[1697487549.394536] 64 bytes from 151.101.38.217: icmp_seq=1 ttl=58 time=92.3 ms
[1697487559.501814] 64 bytes from 151.101.38.217: icmp_seq=2 ttl=58 time=92.7 ms
[1697487569.609763] 64 bytes from 151.101.38.217: icmp_seq=3 ttl=58 time=93.3 ms
[1697487579.717402] 64 bytes from 151.101.38.217: icmp_seq=4 ttl=58 time=92.8 ms
[1697487589.825502] 64 bytes from 151.101.38.217: icmp_seq=5 ttl=58 time=93.6 ms
[1697487599.933087] 64 bytes from 151.101.38.217: icmp_seq=6 ttl=58 time=93.0 ms
[1697487610.040171] 64 bytes from 151.101.38.217: icmp_seq=7 ttl=58 time=93.9 ms
[1697487620.147881] 64 bytes from 151.101.38.217: icmp_seq=8 ttl=58 time=93.1 ms
[1697487630.255813] 64 bytes from 151.101.38.217: icmp_seq=9 ttl=58 time=93.3 ms
[1697487630.443650] 64 bytes from 151.101.38.217: icmp_seq=10 ttl=58 time=91.9 ms

--- dualstack.osff.map.fastly.net ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 90970ms
rtt min/avg/max/mdev = 91.920/92.994/93.863/0.557 ms

There's a 10-second interval between responses - I doubt that's normal. Here's mumble.info for comparison:

mataha@shuchiin:~$ ping -i 0.1 -D -c 10 mumble.info
PING mumble.info (159.100.252.167) 56(84) bytes of data.
[1697487777.499669] 64 bytes from mumble.info (159.100.252.167): icmp_seq=1 ttl=51 time=101 ms
[1697487777.610677] 64 bytes from mumble.info (159.100.252.167): icmp_seq=2 ttl=51 time=104 ms
[1697487777.717841] 64 bytes from mumble.info (159.100.252.167): icmp_seq=3 ttl=51 time=100 ms
[1697487777.823062] 64 bytes from mumble.info (159.100.252.167): icmp_seq=4 ttl=51 time=101 ms
[1697487777.931776] 64 bytes from mumble.info (159.100.252.167): icmp_seq=5 ttl=51 time=102 ms
[1697487778.039693] 64 bytes from mumble.info (159.100.252.167): icmp_seq=6 ttl=51 time=101 ms
[1697487778.153609] 64 bytes from mumble.info (159.100.252.167): icmp_seq=7 ttl=51 time=108 ms
[1697487778.255993] 64 bytes from mumble.info (159.100.252.167): icmp_seq=8 ttl=51 time=104 ms
[1697487778.357902] 64 bytes from mumble.info (159.100.252.167): icmp_seq=9 ttl=51 time=105 ms
[1697487778.454806] 64 bytes from mumble.info (159.100.252.167): icmp_seq=10 ttl=51 time=101 ms

--- mumble.info ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 955ms
rtt min/avg/max/mdev = 100.156/102.712/107.514/2.290 ms, pipe 2
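
The gap between replies can be read straight off the `-D` timestamps in the two runs above. The following is a minimal sketch of that arithmetic; the helper name `gaps` is made up for illustration, and the timestamps are copied from the output above. With these numbers, eight of the nine gaps in the publist.mumble.info run come out at about 10.1 seconds (the last is about 0.19 seconds), while every gap in the mumble.info run is close to the requested 0.1 seconds. The round-trip times themselves are similar in both runs, so the delay is in how often replies show up, not in how long each one takes.

```python
from decimal import Decimal


def gaps(timestamps):
    """Return the differences between consecutive timestamps (in seconds)."""
    values = [Decimal(t) for t in timestamps]  # Decimal avoids float rounding
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Timestamps copied from the `ping -D` output for publist.mumble.info.
PUBLIST = [
    "1697487549.394536", "1697487559.501814", "1697487569.609763",
    "1697487579.717402", "1697487589.825502", "1697487599.933087",
    "1697487610.040171", "1697487620.147881", "1697487630.255813",
    "1697487630.443650",
]

# Timestamps copied from the `ping -D` output for mumble.info.
MUMBLE_INFO = [
    "1697487777.499669", "1697487777.610677", "1697487777.717841",
    "1697487777.823062", "1697487777.931776", "1697487778.039693",
    "1697487778.153609", "1697487778.255993", "1697487778.357902",
    "1697487778.454806",
]

publist_gaps = gaps(PUBLIST)
mumble_info_gaps = gaps(MUMBLE_INFO)

print("publist.mumble.info gaps:", [str(g) for g in publist_gaps])
print("mumble.info gaps:        ", [str(g) for g in mumble_info_gaps])
```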

zetanor commented 11 months ago

It'll be faster if there's a "_mumble" TCP service record in the DNS for the host you're pinging (i.e., a SRV entry with prefix "_mumble._tcp"). See ServerResolverPrivate::resolve https://github.com/mumble-voip/mumble/blob/c21a90a831ad12f95995bc55ce5e864a0b2b24e9/src/ServerResolver.cpp#L46C35-L46C35

mataha commented 11 months ago

> It'll be faster if there's a "_mumble" TCP service record in the DNS for the host you're pinging (i.e., a SRV entry with prefix "_mumble._tcp"). See ServerResolverPrivate::resolve https://github.com/mumble-voip/mumble/blob/c21a90a831ad12f95995bc55ce5e864a0b2b24e9/src/ServerResolver.cpp#L46C35-L46C35

I'm a regular user, not a server maintainer - I have no control over a server's DNS records.

Anyway, this has nothing to do with the servers themselves; on other, third-party Mumble clients (e.g. Plumble Free for Android) there's no delay involved - the same server list gets refreshed instantly. It's not a matter of latency either; I can observe that, after the aforementioned delay, servers farther away get shown a few milliseconds later, as one would expect from the distance.

What's causing the initial delay?

zetanor commented 11 months ago

> > It'll be faster if there's a "_mumble" TCP service record in the DNS for the host you're pinging (i.e., a SRV entry with prefix "_mumble._tcp"). See ServerResolverPrivate::resolve https://github.com/mumble-voip/mumble/blob/c21a90a831ad12f95995bc55ce5e864a0b2b24e9/src/ServerResolver.cpp#L46C35-L46C35
>
> I'm a regular user, not a server maintainer - I have no control over a server's DNS records.
>
> Anyway, this has nothing to do with the servers themselves; on other, third-party Mumble clients (e.g. Plumble Free for Android) there's no delay involved - the same server list gets refreshed instantly. It's not a matter of latency either; I can observe that, after the aforementioned delay, servers farther away get shown a few milliseconds later, as one would expect from the distance.
>
> What's causing the initial delay?

I didn't look at the source and I know nothing about the frameworks involved, plus your problem might be different from what I observed a long time ago, but my guess is that there's an issue in the way Qt resolves SRV records. Do you get the initial ping delay on both "mumble.hole.observer" (which has a proper SRV record) and "mumblenosrv.hole.observer" (which doesn't)?

mataha commented 11 months ago

Huh. I stand corrected - in 19 of 20 attempts there was no delay on mumble.hole.observer.

So this is a Qt issue, then? I wonder when it was introduced...

zetanor commented 11 months ago

> Huh. I stand corrected - in 19 of 20 attempts there was no delay on mumble.hole.observer.
>
> So this is a Qt issue, then? I wonder when it was introduced...

It's a bit more complicated than a simple code bug. Most people don't know what a service record is or how one works, so it's more of a complicated surprise than a bug. To simplify, let's say that it's a DNS entry that returns a TCP/UDP port rather than an IP address. You'll find that "mumble.hole.observer" pings properly no matter what port you type in, whereas "mumblenosrv.hole.observer" firmly needs to be on port 64738 to ping.
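
To make the service record idea a little more concrete, here is a rough sketch in Python (not Mumble's actual code). Strictly speaking, an SRV record as defined in RFC 2782 carries a priority, a weight, a port, and a target host name; the simplification above focuses on the port. The function names, the `SrvRecord` type, and the example.org record data are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """The four fields of SRV record data (RFC 2782)."""
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(host: str) -> str:
    """Build the name to query for a Mumble SRV record on `host`.

    Uses the "_mumble._tcp" prefix mentioned earlier in this thread.
    """
    return f"_mumble._tcp.{host.rstrip('.')}"


def parse_srv_rdata(rdata: str) -> SrvRecord:
    """Parse SRV data in presentation format: 'priority weight port target'."""
    priority, weight, port, target = rdata.split()
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


# Hypothetical record for a hypothetical host; not taken from any real zone.
record = parse_srv_rdata("0 5 12345 voice.example.org.")
print(srv_query_name("mumble.example.org"), "->", record.target, record.port)
```

With a record like this in place, a client that looks up the SRV entry first connects to the port from the record, which would explain why the port typed into the dialog does not matter for a host that has one.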

Insofar as Mumble is coded to ask for a service record before using the provided port, and insofar as Qt's DNS library does what it's being asked, there's no bug. That said, it could be argued that Mumble should only ever do an SRV lookup if no explicit port is provided. It shouldn't be too difficult to change the client to allow a blank port, and to use that as a switching condition between a standard DNS resolution and an SRV-first (with fallback to 64738) resolution.
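
The switching condition proposed above could look roughly like this. It is only a sketch of the idea in Python, not how the Mumble client is written; `choose_endpoint` and the injected `lookup_srv` callable are hypothetical names, and the actual DNS query is deliberately left out.

```python
from typing import Callable, Optional, Tuple

DEFAULT_MUMBLE_PORT = 64738  # fallback port mentioned above

# Hypothetical lookup: takes an SRV query name and returns (target_host, port),
# or None when no SRV record exists.
SrvLookup = Callable[[str], Optional[Tuple[str, int]]]


def choose_endpoint(
    host: str, port: Optional[int], lookup_srv: SrvLookup
) -> Tuple[str, int]:
    """Pick the host and port to contact.

    - Explicit port given: skip the SRV lookup and use host/port as entered.
    - Port left blank: try the SRV record first, then fall back to the default.
    """
    if port is not None:
        return host, port
    found = lookup_srv(f"_mumble._tcp.{host}")
    if found is not None:
        return found
    return host, DEFAULT_MUMBLE_PORT
```

Passing the lookup in as a parameter keeps the decision logic separate from the resolver, so the "explicit port means no SRV query" rule can be checked without touching the network.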

In the meantime, using a direct IP entry should never cause an SRV lookup. Alternatively, using a higher-performance DNS server (I use IBM's unfiltered Quad9 "9.9.9.10") might give you faster pings on SRV-less hosts (since you'll get a negative answer faster).
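
For what it's worth, telling a direct IP entry apart from a host name is cheap. Here is a small Python sketch of that check using the standard `ipaddress` module; the function names are made up, and this is an assumption about how such a check could be written, not a description of what the Mumble client does.

```python
import ipaddress


def is_ip_literal(host: str) -> bool:
    """Return True if `host` is a plain IPv4 or IPv6 address, not a name."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def needs_srv_lookup(host: str) -> bool:
    """An SRV lookup only makes sense for host names, never for IP literals."""
    return not is_ip_literal(host)


for entry in ("159.100.252.167", "mumble.info"):
    print(entry, "-> SRV lookup needed:", needs_srv_lookup(entry))
```

Note that this treats a bracketed IPv6 address such as "[::1]" as a host name, so a real client would need to strip the brackets first.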

mataha commented 10 months ago

Thanks for the explanation - it's a bit clearer to me now, though I feel like this (SRV record lookup) should be - as you've already said - optional.

Direct IP it is (I'm on 9.9.9.10 already).