mumble-voip / mumble

Mumble is an open-source, low-latency, high quality voice chat software.
https://www.mumble.info

Compiling murmur with gRPC enabled renders the server useless #4201

Closed theAkito closed 3 years ago

theAkito commented 4 years ago

Describe the bug

I spent hours on this issue, because there are no real symptoms besides the "Server connection failed: Connection refused." error message when trying to connect. If I compile any version of murmur in a Docker image, I get this error. Compilation finishes successfully, the server starts without any problems whatsoever, and NO runtime errors are evident from the server log. It's as if the server is perfectly healthy. Yet you cannot connect.

The workaround is to disable compilation with gRPC support. Now it works. Don't know why, as there aren't any real error messages.

Steps to Reproduce

Steps to reproduce the behavior:

  1. Compile murmur with gRPC enabled.
  2. Try to connect to the server.
  3. Get Connection refused errors in the client.

Expected behavior

Server should work with gRPC enabled.
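A "Connection refused" error usually means nothing is listening on the target port, so a quick TCP probe can confirm that before digging into the client. The following is a minimal sketch, assuming the server is expected on localhost; `is_listening` is a hypothetical helper name and not part of Mumble.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and socket timeouts are subclasses of OSError.
        return False
```

For example, `is_listening("127.0.0.1", 64738)` probes Mumble's default port; use the port from murmur.ini if it was changed. A `False` result while the process is running points at the server never opening its socket rather than at a firewall between client and host.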

Additional context

Remove the grpc here and all is good: https://github.com/theAkito/docker-murmur/blob/45bf3718e0a3fd83474af344ea72ffb3076340dd/Dockerfile#L42

Krzmbrzl commented 4 years ago

Could you try the same with the latest master branch (aka a 1.4.0 snapshot)? I never had these issues and I always include gRPC support :thinking:

theAkito commented 4 years ago

@Krzmbrzl

I changed branch in the above Dockerfile to master and compiled with gRPC support. -> Connection refused...

I only removed grpc from the referenced line. -> The server works flawlessly!

Krzmbrzl commented 4 years ago

Maybe this is a Docker-weirdness then. Any chance you could compile Mumble manually and check with that? :thinking:

theAkito commented 4 years ago

@Krzmbrzl

You are right. I copied the commands in the Dockerfile into the CLI and now it works with gRPC enabled. I initially excluded the chance of a Docker related quirk, because I am running a lot of other servers in Docker containers and every time I had a problem, it was never related to Docker itself. So this is exceptionally weird. What to do next?

Krzmbrzl commented 4 years ago

So given that you copied the instructions from the Dockerfile 1:1, I think the issue is indeed within Docker itself somehow. The question is: What is the problem? xD

But on the other hand I think there's also a bit of weirdness in the Mumble code that causes the server to refuse connections without giving a reason. And somehow this seems to be related to the gRPC code in some way... Are you using any special configuration for your server (e.g. external authenticator)?

theAkito commented 4 years ago

@Krzmbrzl

I do not use any special configuration, as far as I know. I think just performing minor edits in the murmur.ini shouldn't be anything special (no CA cert, no other special things enabled, except gRPC and even that is only enabled because the official Docker image is compiled with gRPC, so I decided to take this as a recommendation, even though gRPC support is technically experimental).

That said, in yesterday's debugging process I tried all kinds of workarounds, and part of that was using a fresh, almost entirely unedited murmur.ini with no database (automatically created during first run), and it still did not work. So it can't be related to my edited murmur.ini or the old database I am using.

The only thing that caught my eye when I looked at the Dockerfile again is that I did not open the gRPC port in the firewall. However, it is still not open and the server runs flawlessly when not run within Docker, so it should not be related to that.

This issue is really weird. 🤷

Krzmbrzl commented 4 years ago

My best guess really is that there is something going on with the external authenticator somehow :thinking:

theAkito commented 4 years ago

@Krzmbrzl

I found a difference in logs.

Branch master:

without Docker, with gRPC (working):

<X>2020-05-24 13:46:15.806 SSL: OpenSSL version is 'OpenSSL 1.1.1d  10 Sep 2019'
<W>2020-05-24 13:46:15.806 Initializing settings from /home/murmur/data/murmur.ini (basepath /home/murmur/data)
<W>2020-05-24 13:46:15.806 Binding to address 0.0.0.0
<W>2020-05-24 13:46:16.678 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA"
<C>2020-05-24 13:46:16.681 Successfully switched to uid 800
<W>2020-05-24 13:46:16.752 ServerDB: Opened SQLite database /home/murmur/data/murmur.sqlite
<W>2020-05-24 13:46:16.755 ServerDB: Configured SQLite for journal_mode=WAL, synchronous=NORMAL
<W>2020-05-24 13:46:16.755 Resource limits were 0 0
<W>2020-05-24 13:46:16.755 Successfully dropped capabilities
<W>2020-05-24 13:46:16.758 MurmurIce: Endpoint "tcp -h 127.0.0.1 -p 6502 -t 60000" running
<W>2020-05-24 13:46:16.761 GRPC: listening on '127.0.0.1:50051'
<W>2020-05-24 13:46:16.821 Murmur 1.4.0 (Compiled by User) running on X11: Debian GNU/Linux 10 (buster): Booting servers

with Docker, with gRPC (not working):

<X>2020-05-24 15:16:14.789 SSL: OpenSSL version is 'OpenSSL 1.1.1g  21 Apr 2020'
<W>2020-05-24 15:16:14.789 Initializing settings from /data/murmur.ini (basepath /data)
<W>2020-05-24 15:16:14.790 Binding to address 0.0.0.0
<W>2020-05-24 15:16:15.448 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA"
<W>2020-05-24 15:16:15.450 ServerDB: Opened SQLite database /data/murmur.sqlite
<W>2020-05-24 15:16:15.452 ServerDB: Configured SQLite for journal_mode=WAL, synchronous=NORMAL
<W>2020-05-24 15:16:15.464 MurmurIce: Endpoint "tcp -h 127.0.0.1 -p 6502 -t 60000" running

with Docker, without gRPC (working):

<X>2020-05-24 15:22:00.899 SSL: OpenSSL version is 'OpenSSL 1.1.1g  21 Apr 2020'
<W>2020-05-24 15:22:00.899 Initializing settings from /data/murmur.ini (basepath /data)
<W>2020-05-24 15:22:00.899 Binding to address 0.0.0.0
<W>2020-05-24 15:22:01.638 MetaParams: TLS cipher preference is "TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:AES256-SHA:AES128-SHA"
<W>2020-05-24 15:22:01.644 ServerDB: Opened SQLite database /data/murmur.sqlite
<W>2020-05-24 15:22:01.648 ServerDB: Configured SQLite for journal_mode=WAL, synchronous=NORMAL
<W>2020-05-24 15:22:01.680 MurmurIce: Endpoint "tcp -h 127.0.0.1 -p 6502 -t 60000" running
<W>2020-05-24 15:22:01.680 This version of Murmur was built without gRPC support. Ignoring 'grpc' option from configuration file.
<W>2020-05-24 15:22:01.684 OSInfo: Failed to execute lsb_release
<W>2020-05-24 15:22:01.684 Murmur 1.4.0 (Compiled by User) running on X11: Linux 4.19.0-8-amd64: Booting servers

Krzmbrzl commented 4 years ago

Ah so this is looking like murmur isn't even booting the servers up in the Docker image :thinking:
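The difference between the three logs comes down to which startup lines appear. A minimal sketch of that comparison, matching only on message fragments that occur in the logs quoted above; `boot_status` is a hypothetical helper, not a Mumble tool:

```python
def boot_status(log_text: str) -> dict:
    """Report which startup milestones appear in a murmur log."""
    lines = log_text.splitlines()

    def seen(fragment: str) -> bool:
        return any(fragment in line for line in lines)

    return {
        # Printed when the gRPC endpoint came up (working non-Docker run).
        "grpc_listening": seen("GRPC: listening on"),
        # Printed when the binary was compiled without gRPC.
        "built_without_grpc": seen("built without gRPC support"),
        # Last line of both working runs; absent from the failing one.
        "booting_servers": seen("Booting servers"),
    }
```

Fed the failing Docker log, all three values come out `False`: the log stops after the MurmurIce line, before either gRPC message and before "Booting servers".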

Could you run murmur with the -v option please? Maybe this adds some more info to the logs...

theAkito commented 4 years ago

@Krzmbrzl

I never ran it without -v: https://github.com/theAkito/docker-murmur/blob/45bf3718e0a3fd83474af344ea72ffb3076340dd/Dockerfile#L75

😆

streaps commented 4 years ago

I'm running 1.3.1-rc1 inside an LXC container without any problems (with gRPC enabled of course).

theAkito commented 4 years ago

@streaps Are you by any chance able to try out running it within Docker? Ideally with https://raw.githubusercontent.com/theAkito/docker-murmur/45bf3718e0a3fd83474af344ea72ffb3076340dd/Dockerfile .

It's important that you compile the binary yourself rather than downloading it from the releases.

streaps commented 4 years ago

Unfortunately not.

theAkito commented 4 years ago

@Krzmbrzl

I think I found the right lead to this issue.

I changed the OS in Docker to ubuntu:latest, which is the one being used in the official Docker image. Now gRPC works. However, I don't understand what the difference is between those deb packages, as they should be pretty much the same, without significant differences. Perhaps there was a breaking change in one of the libraries used, and that breaking change is then included in the current Debian Testing distribution.

Is there an OS independent list of the dependencies and their (max.) versions? I initially wanted to create an Alpine image anyway, but I found it too exhausting to find the correct package names à la Alpine when just having the Debian/Ubuntu package names as an example, so I changed to Debian.


I confirmed my guess. I changed the OS in Docker to the current Debian Stable (Buster) and the server runs. That's why it worked on my host, as my host runs this version, too.

Looking at all of the evidence, I think this line is the culprit: https://github.com/theAkito/docker-murmur/blob/64dcbac590d652ab6f901bee90587b1e09c344b1/Dockerfile#L54

The regex is too loose and allows any version.

https://packages.debian.org/buster/libgrpc6
https://packages.debian.org/bullseye/libgrpc9

libgrpc6 is working. libgrpc9 is not working.
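To illustrate how a loose package pattern lets the version drift between distributions: the Dockerfile line itself is not quoted in this thread, so both patterns below are hypothetical stand-ins, and the package list is only a small sample.

```python
import re

# Hypothetical loose pattern: accepts any libgrpc soname version.
LOOSE = re.compile(r"libgrpc\d+")
# Hypothetical pinned pattern: accepts only the version reported as working.
PINNED = re.compile(r"libgrpc6")


def select(pattern: re.Pattern, packages: list) -> list:
    """Return the package names that the pattern matches in full."""
    return [name for name in packages if pattern.fullmatch(name)]


packages = ["libgrpc6", "libgrpc9", "libgrpc-dev"]
loose_hits = select(LOOSE, packages)    # both runtime versions match
pinned_hits = select(PINNED, packages)  # only libgrpc6 matches
```

With the loose pattern, whichever runtime the distribution ships gets installed (libgrpc6 on Buster, libgrpc9 on Bullseye); pinning the name makes the build fail loudly instead of silently picking up the newer library.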

Can you find out which version introduced which breaking change that caused this issue?

streaps commented 4 years ago

There is an alpine package. https://pkgs.alpinelinux.org/package/v3.11/community/x86_64/murmur

and you find the build dependencies in the APKBUILD file https://git.alpinelinux.org/aports/tree/community/mumble/APKBUILD?h=3.11-stable

for gRPC support you need grpc-dev

streaps commented 4 years ago

libgrpc9 1.26.0 is also in buster-backports and will be the only version in Ubuntu 20.10.

Alpine Linux 3.11 has version 1.25.0 and murmurd linked to libgrpc.so.8

So maybe there were some changes between 1.25.0 and 1.26.0 that broke gRPC in murmur?
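To check which libgrpc soname a given murmurd binary actually links against, the output of `ldd` can be parsed. A minimal sketch, assuming a Linux system with `ldd` available; the helper names and the binary path in the usage note are hypothetical.

```python
import re
import subprocess

# Matches e.g. "libgrpc.so.8" but not "libgrpc++.so.1".
SONAME = re.compile(r"\blibgrpc\.so\.(\d+)\b")


def grpc_soname_versions(ldd_output: str) -> list:
    """Extract the distinct libgrpc soname versions from ldd output."""
    return sorted({int(v) for v in SONAME.findall(ldd_output)})


def linked_grpc(binary: str) -> list:
    """Run ldd on a binary and return the libgrpc soname versions it uses."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return grpc_soname_versions(result.stdout)
```

Calling `linked_grpc("/usr/bin/murmurd")` (path is an assumption) inside each image would show whether the working and failing builds differ only in the soname, which is the comparison this comment is making by hand.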

Krzmbrzl commented 4 years ago

@theAkito good catch!

Is there an OS independent list of the dependencies and their (max.) versions?

Apart from CELT I don't think we have (intentional) max. versions and afaik we only have the platform specific dependency lists :thinking:

libgrpc6 is working. libgrpc9 is not working.

It'd still be very interesting why... If I had the time this would be the ideal place for testing in a VM, but I'm really busy with other stuff atm. And besides: The gRPC stuff got a rewrite in #3947 so maybe that'll solve the problem as a side-effect.

Can you find out which version introduced which breaking change that caused this issue?

At least not in a trivial way :thinking: @McKayJT do you happen to have a clue what could be wrong here? You know the gRPC stuff way better than me :)

theAkito commented 4 years ago

@streaps

Where is the libgrpc runtime package in Alpine? I can only find grpc, though I don't know if this is sufficient. Everything else seems to be indeed within the APKBUILD file, thanks.

streaps commented 4 years ago

grpc in Alpine is the lib/runtime. Btw, Alpine edge has version 1.28.0 / libgrpc.so.9

https://pkgs.alpinelinux.org/contents?branch=edge&name=grpc&arch=x86_64&repo=community

McKayJT commented 4 years ago

@McKayJT do you happen to have a clue what could be wrong here? You know the gRPC stuff way better than me :)

I know that without my patch set, if the settings for gRPC are incorrect it segmentation faults: that is, if it's unable to bind to the correct port or the certificate files are improper or cannot be read the BuildAndStart() method on the grpc::ServerBuilder object returns null and it's not checked for.
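The two preconditions named here (the port can be bound, the certificate files can be read) can be checked from outside the server before starting it. A minimal sketch, assuming the gRPC address from the working log above (127.0.0.1:50051); the helper names and any certificate paths are hypothetical, and this does not validate the certificate contents.

```python
import os
import socket


def can_bind(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, port))
        return True
    except OSError:
        # Address in use, address not available, or permission denied.
        return False


def is_readable_file(path: str) -> bool:
    """Return True if path is an existing file the current user can read."""
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

Running `can_bind("127.0.0.1", 50051)` inside the container as the user murmur runs as, plus `is_readable_file(...)` on the configured certificate and key, would rule these two causes in or out without needing the patched build.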

I don't ever use Docker so I'm not really in a position to test this (I used a systemd-nspawn container when I tested building for Ubuntu). Can you apply this patch to the master branch and see if you get any output?

theAkito commented 4 years ago

@McKayJT

Applied patch, but same problem.

McKayJT commented 4 years ago

Does that version have anything new in the log files, or if you run it on the command line put anything on standard out?

If it doesn't I would be interested in what this binary does. It is a static linked binary of master against grpc 1.26.0 (the same as libgrpc9) with options c++17 static no-client no-ice no-bonjour no-dbus grpc. That would let me know if it's something with gRPC itself with a newer version or the build environment.

theAkito commented 4 years ago

Does that version have anything new in the log files

No.

If it doesn't I would be interested in what this binary does. It is a static linked binary of master against grpc 1.26.0 (the same as libgrpc9) with options c++17 static no-client no-ice no-bonjour no-dbus grpc. That would let me know if it's something with gRPC itself with a newer version or the build environment.

I can do that later.

Krzmbrzl commented 3 years ago

@theAkito did you get to test what was suggested?

no-response[bot] commented 3 years ago

This issue has been automatically closed because there has been no response to our request for more information. With only the information that is currently in the issue, we don't have enough information to take action.

Please reach out if you have or find the answers we need so that we can investigate further (or if you feel like this issue shouldn't be closed for another reason).