neutrinolabs / xrdp

xrdp: an open source RDP server
http://www.xrdp.org/
Apache License 2.0

xrdp process seems to be using wrong IP stack for connection to sesman #1805

Closed BrianDead closed 2 years ago

BrianDead commented 3 years ago

I'm trying to get to the bottom of a problem where xrdp has stopped working for me recently. The symptom is that when I connect (using Microsoft remote desktop from a Mac) I just get a blank black screen. It was working great in the past.

Looking in the logs, I can see that at the point where xrdp is trying to connect to sesman to initiate an Xorg session, it fails: I see no corresponding connection received in xrdp-sesman.log. In the xrdp log I see error messages about closed connections, suggesting that the xrdp process is binding to an IPv6 stack on the interface even though it's trying to connect to 127.0.0.1.

xrdp log

Feb 10 19:56:48 plethora xrdp[21818]: (21818)(140681374807872)[INFO ] TLS connection established from 10.46.162.2 port 51147: TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384
Feb 10 19:56:48 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] xrdp_0000553a_wm_login_mode_event_00000001
Feb 10 19:56:48 plethora xrdp[21818]: (21818)(140681374807872)[INFO ] Loading keymap file /etc/xrdp/km-00000809.ini
Feb 10 19:56:48 plethora xrdp[21818]: (21818)(140681374807872)[WARN ] local keymap file for 0x00000809 found and doesn't match built in keymap, using local keymap file
Feb 10 19:56:48 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] xrdp_wm_log_msg: connecting to sesman ip 127.0.0.1 port 3350
Feb 10 19:56:52 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] Closed socket 18 (AF_INET6 fe80::cf20:a5a:7fa0:72e3 port 44516)
Feb 10 19:56:56 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] Closed socket 18 (AF_INET6 fe80::cf20:a5a:7fa0:72e3 port 44518)
Feb 10 19:57:00 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] Closed socket 18 (AF_INET6 fe80::cf20:a5a:7fa0:72e3 port 44520)
Feb 10 19:57:04 plethora xrdp[21818]: (21818)(140681374807872)[ERROR] xrdp_wm_log_msg: Error connecting to sesman: 127.0.0.1 port: 3350
Feb 10 19:57:04 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] Closed socket 18 (AF_INET6 fe80::cf20:a5a:7fa0:72e3 port 44522)
Feb 10 19:57:04 plethora xrdp[21818]: (21818)(140681374807872)[DEBUG] return value from xrdp_mm_connect 1

If I run netcat to connect to port 3350 on 127.0.0.1 from the command-line, I see a log entry in xrdp-sesman.log:

Feb 10 19:57:30 plethora xrdp-sesman[21705]: (21705)(139921126086208)[INFO ] A connection received from ::ffff:127.0.0.1 port 59830
Feb 10 19:57:32 plethora xrdp-sesman[21705]: (21705)(139921126086208)[WARN ] libscp network error.
Feb 10 19:57:32 plethora xrdp-sesman[21705]: (21705)(139921126086208)[DEBUG] Closed socket 8 (AF_INET6 ::ffff:127.0.0.1 port 3350)

The fact that I see no log messages like this while trying to actually establish a session suggests to me that the connection attempt from xrdp is failing. The apparent source address of that connection is an IPv6 address, even though it's supposed to be connecting to IPv4 127.0.0.1, which would seem to be a strong candidate for the cause.
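
To make that reasoning concrete, here is a minimal Python sketch (not part of xrdp) that classifies the address in a "Closed socket" log line. The regular expression and the function name classify_closed_socket are assumptions made for illustration, based only on the log format shown above.

```python
import ipaddress
import re

# Matches the "Closed socket N (AF_xxx ADDRESS port P)" lines shown above.
CLOSED_RE = re.compile(r"Closed socket \d+ \((AF_INET6?) (\S+) port (\d+)\)")


def classify_closed_socket(line):
    """Return (family, address, kind) for a 'Closed socket' log line, or None."""
    match = CLOSED_RE.search(line)
    if match is None:
        return None
    family, text, _port = match.groups()
    addr = ipaddress.ip_address(text)
    # Unwrap IPv4-mapped IPv6 addresses such as ::ffff:127.0.0.1 so that the
    # embedded IPv4 address is the one being classified.
    mapped = getattr(addr, "ipv4_mapped", None)
    effective = mapped if mapped is not None else addr
    if effective.is_loopback:
        kind = "loopback"
    elif effective.is_link_local:
        kind = "link-local"
    else:
        kind = "other"
    return family, text, kind
```

Applied to the lines above, the sockets closed by xrdp classify as link-local, while the netcat connection logged by sesman classifies as an IPv4-mapped loopback address.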

The configuration for the Xorg session in xrdp.ini is as follows:

[Xorg]
name=Xorg
lib=libxup.so
username=ask
password=ask
ip=127.0.0.1
port=-1
code=20

The corresponding config in sesman looks like this:

[Globals]
ListenAddress=127.0.0.1
ListenPort=3350
EnableUserWindowManager=true
; Give in relative path to user's home directory
UserWindowManager=startwm.sh
; Give in full path or relative path to /etc/xrdp
DefaultWindowManager=startwm.sh
; Give in full path or relative path to /etc/xrdp
ReconnectScript=reconnectwm.sh

This is the output from xrdp --help:

xrdp 0.9.12
  A Remote Desktop Protocol Server.
  Copyright (C) 2004-2018 Jay Sorg, Neutrino Labs, and all contributors.
  See https://github.com/neutrinolabs/xrdp for more information.

  Configure options:
      --enable-ipv6
      --enable-jpeg
      --enable-fuse
      --enable-rfxcodec
      --enable-opus
      --enable-painter
      --enable-vsock
      --build=x86_64-linux-gnu
      --prefix=/usr
      --includedir=${prefix}/include
      --mandir=${prefix}/share/man
      --infodir=${prefix}/share/info
      --sysconfdir=/etc
      --localstatedir=/var
      --disable-silent-rules
      --libdir=${prefix}/lib/x86_64-linux-gnu
      --libexecdir=${prefix}/lib/x86_64-linux-gnu
      --disable-maintainer-mode
      --disable-dependency-tracking
      --with-socketdir=/run/xrdp/sockdir
      build_alias=x86_64-linux-gnu
      CFLAGS=-g -O2 -fdebug-prefix-map=/build/xrdp-GJgww4/xrdp-0.9.12=. -fstack-protector-strong -Wformat -Werror=format-security 
      LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed
      CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2 
      PKG_CONFIG_PATH=/build/xrdp-GJgww4/xrdp-0.9.12/pkgconfig

  Compiled with OpenSSL 1.1.1f  31 Mar 2020

BrianDead commented 3 years ago

I tried rebuilding it myself without the IPv6 support enabled, and it got better. It seems that it fails because I have IPv6 disabled on the loopback interface but enabled on one of the ethernet interfaces.

I think there's a problem with the logic in g_tcp_connect and connect_loopback functions when ipv6 is enabled on some interfaces but not on loopback.

If the target IP address for the connection from xrdp to sesman is exactly the string 127.0.0.1 (checked with g_strcmp), then g_tcp_connect delegates the connection to connect_loopback:

if (g_strcmp(address, "127.0.0.1") == 0)
{
    return connect_loopback(sck, port);
}

connect_loopback then ignores the loopback address specified in the config and tries to connect three ways: first over IPv6 using in6addr_loopback (::1); if that fails, over IPv4 using INADDR_LOOPBACK (127.0.0.1); and finally using the IPv4-mapped IPv6 address ::FFFF:127.0.0.1.

I think the problem here is that sesman isn't actually listening on the IPv6 loopback, but there is a valid IPv6 stack on the system. The first part of connect_loopback attempts the connection and gets an EINPROGRESS response, because the stack has sent the SYN but has not received a reply. It then never proceeds to try the IPv4 version, which the config was actually asking it to use all along.
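
A minimal sketch of the fallback order described above, written in Python for brevity rather than in xrdp's C. It is not the actual connect_loopback code: the names connect_loopback_sketch and tcp_connect and the one-second timeout are assumptions. The point it illustrates is that every failed or timed-out attempt moves on to the next candidate instead of stopping at the first one.

```python
import socket

# Order described above: ::1, then 127.0.0.1, then ::ffff:127.0.0.1.
LOOPBACK_CANDIDATES = [
    (socket.AF_INET6, "::1"),
    (socket.AF_INET, "127.0.0.1"),
    (socket.AF_INET6, "::ffff:127.0.0.1"),
]


def tcp_connect(family, host, port, timeout):
    """Open a TCP connection, raising OSError on failure or timeout."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect((host, port))
    except OSError:
        sock.close()
        raise
    return sock


def connect_loopback_sketch(port, timeout=1.0, connect=tcp_connect):
    """Try each candidate in turn; return (host, sock) for the first success."""
    errors = []
    for family, host in LOOPBACK_CANDIDATES:
        try:
            return host, connect(family, host, port, timeout)
        except OSError as exc:
            # A refused, unreachable or timed-out attempt is recorded and the
            # next candidate is tried, so IPv4 still gets its turn.
            errors.append(f"{host}: {exc}")
    raise ConnectionError("all loopback candidates failed: " + "; ".join(errors))
```

The connect parameter exists only so the ordering can be exercised without real sockets; calling connect_loopback_sketch(3350) would use tcp_connect against a local listener.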

I don't know what the answer is in code terms, but it does feel like the logic of trying an IPv6 loopback even when configured to use the IPv4 loopback address is problematic. To be honest, I'm not sure I meant to be in that state, and I'm not sure whether it's even a valid configuration but it was there and the system was running.

matt335672 commented 3 years ago

Hi @BrianDead

[Nice handle, BTW]

You're right that this area is problematic.

In the short- to medium-term we're going to get xrdp and xrdp-sesman conversing over UNIX domain socket. This will take IPv4 and IPv6 out of the inter-process comms completely and problems like this will just go away.
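
As a rough illustration of why a UNIX domain socket avoids this class of problem, the Python sketch below passes a message between a listener and a client using only a filesystem path, with no IPv4 or IPv6 address involved. The socket name sesman.sock and the function uds_echo_once are hypothetical; this is not how xrdp implements it.

```python
import os
import socket
import tempfile


def uds_echo_once(payload):
    """Send payload over a UNIX domain socket and return what the listener read."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sesman.sock")  # hypothetical socket name
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                # The connection is queued in the listen backlog, so connect()
                # returns before accept() is called in this single thread.
                client.connect(path)
                client.sendall(payload)
                conn, _ = server.accept()
                with conn:
                    return conn.recv(len(payload))
```

The endpoint is identified purely by its path, so the state of IPv6 on the loopback interface has no bearing on whether the connection succeeds.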

With that in mind, I'm personally not keen on making a lot of changes in this connection code at the moment; it's been changed quite a few times in the past and I've a horrible feeling we'll just end up breaking someone else's working config.

In the short term, we need to find a fix for your problem however.

How about getting xrdp-sesman to listen on ::1 ? If I understand your problem as described above, that might work.

To do that:-

If I do that on my test system:-

$ sudo ss -tlnp | grep xrdp-sesman
LISTEN    0         2                    [::1]:3350                [::]:*        users:(("xrdp-sesman",pid=23264,fd=7))                                         

Is that a good workaround?

BrianDead commented 3 years ago

Hey @matt335672

I think the problem only showed itself because I had disabled IPv6 on the loopback interface, while it was still enabled on one of the regular interfaces. This was why the IPv6 loopback connection kind of worked, but ultimately failed.

Getting xrdp-sesman to listen on ::1 would not work in this scenario because there was no loopback interface with an appropriate address for it to listen on. I did try this before I remembered that IPv6 was disabled on loopback.

One thing I did try was changing the listening address to 127.0.0.2 and changing the corresponding connection address in the Xorg service definition in xrdp.ini. This made the connection between xrdp and xrdp-sesman work (because it failed the strcmp and bypassed the connect_loopback function) but I still didn't get to the point that I could actually see or use the desktop. I suspect there may be other connections happening somewhere later in the process, maybe with Xorg itself, that are also failing in this situation.
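
Why 127.0.0.2 sidesteps connect_loopback can be shown with a small Python comparison. is_literal_loopback mimics the exact-string test quoted earlier in this thread; is_ipv4_loopback is a hypothetical alternative that parses the address, shown only for contrast and not something xrdp is known to do.

```python
import ipaddress


def is_literal_loopback(address):
    """Mimic the exact-string test: only the text "127.0.0.1" matches."""
    return address == "127.0.0.1"


def is_ipv4_loopback(address):
    """Parse the address and test whether it falls inside 127.0.0.0/8."""
    try:
        return ipaddress.IPv4Address(address).is_loopback
    except ipaddress.AddressValueError:
        # Not a valid IPv4 literal (for example "::1" or a hostname).
        return False
```

Both functions agree on 127.0.0.1, but only the parsed check treats 127.0.0.2 as loopback, which is why changing the address takes the ordinary connection path.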

As soon as I re-enabled IPv6 on loopback with sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=0, everything worked with the --enable-ipv6 build.
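
For anyone wanting to check for the same state before changing it, that sysctl is exposed on Linux under /proc/sys. A minimal Python sketch, assuming that layout; the function name ipv6_disabled and the proc_root parameter are illustrative.

```python
from pathlib import Path


def ipv6_disabled(interface="lo", proc_root="/proc/sys"):
    """Return True or False from the disable_ipv6 sysctl, or None if unreadable."""
    path = Path(proc_root) / "net" / "ipv6" / "conf" / interface / "disable_ipv6"
    try:
        # The kernel exposes the value as "0" or "1" followed by a newline.
        return path.read_text().strip() == "1"
    except OSError:
        # Missing interface, no IPv6 support, or a non-Linux system.
        return None
```

Comparing ipv6_disabled("lo") with the result for an ethernet interface would reveal the mixed configuration described in this thread.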

So I'd say this is definitely a rare edge case issue. Most folks won't have my weird, and accidental, IPv6 configuration. If you're planning a re-architecture of these connections to take the IP stack out of the equation, then it may not be worth fixing. But if this connect_loopback function is used for connections with other components than just xrdp-sesman then it may still crop up as an issue from time to time...

matt335672 commented 3 years ago

OK - thanks for the clarification. I can see where you're coming from. It's not a usual configuration, but that doesn't mean it shouldn't be supportable.

I think we'll leave this particular issue open for now. We can close it when we've moved sesman to UNIX domain sockets.

benefacto commented 2 years ago

FYI, for anyone else who encounters this or a similar issue: I had IPv6 disabled on my machine and had to re-enable it for xrdp to be able to connect.

matt335672 commented 2 years ago

This is now moot after the move to UDS (PR #2207)