Closed steffen-kiess closed 1 year ago
Apologies for taking so long to pop the stack on this. I am integrating it into the dev/3.1 evolving branch now. One modification I am making that I wanted to run by you:

I have introduced a new boolean vncserver command-line argument and config file variable (`-uds` / `$useUDS`) that, if enabled, will cause vncserver to choose a default Unix domain socket path (`~/.vnc/$host:$displayNumber.uds`). This will allow users or sysadmins to set a boolean variable in turbovncserver.conf and thus enable the use of Unix domain sockets on a per-user or per-system basis; the vncserver script will then automatically choose and manage the UDS path unless `-rfbunixpath` is explicitly specified. That mirrors the behavior of `-rfbport`, in that a default port based on the display number is chosen unless the user explicitly overrides it. The behavior of Xvnc is unchanged from your patch.

All of this is handled by vncserver. This also opens up the possibility of using Unix domain sockets with the TurboVNC Session Manager at some point in the future. What do you think?
I think that would make sense. Would `-uds` also disable listening on TCP? (This might make sense if `-uds` is combined with `-securitytypes None` to allow authentication by access to the Unix domain socket.)
`-uds` would just be a shortcut for `-rfbunixpath $vncUserDir/$host:$display.uds`, so yes, it would disable TCP connections. I don't want `-uds` to imply specific security types, because eventually I will need `-securitytypes otp` in order to support the Session Manager. Also, if a user wants to allow collaboration, they will have to relax the permissions on the UDS, in which case they would want the collaborator to authenticate using an OTP or Unix credentials.
That sounds good to me.
I am observing a couple of issues when using the feature with `-via` rather than `-tunnel`. When I do something like this:

`/opt/TurboVNC/bin/vncviewer -via {host} localhost:/{uds_path_on_host} -extssh`

I see two SSH connections being made (because the host's SSH banner appears twice, and if public key authentication is disabled, I have to enter my SSH password on the host twice). I assume that is intended behavior but wanted to double-check. The problem is that I get prompted to store the SSH key for localhost. This causes the host's SSH key to be stored as `localhost` in the client's ~/.ssh/known_hosts file, which is incorrect and causes problems if you actually want to connect directly to localhost using SSH.
If I do the same thing without `-extssh`, I similarly get prompted to store the SSH key for localhost, which gets added to the client's ~/.ssh/known_hosts file. Also, the SSH agent doesn't seem to work properly, and I get "Too many authentication failures" when the viewer attempts to make the second (inner) SSH connection. If I don't use public key authentication, then I get prompted for my SSH password on the host when the first (outer) SSH connection is made, then I get prompted for my SSH password on the client when the second (inner) SSH connection is made. Then the viewer fails with "bash: line 0: exec: socat: not found".
Also, I wanted to get a better understanding of the purpose of the threads in StreamDescriptor.java and `Tunnel.createTunnelJSchUDS()`.
Since I'm trying to integrate this into the dev branch, which has changed considerably relative to main, it's entirely possible that one or more of the issues above were caused by integration errors. I have pushed a temporary branch called "dev.uds.wip" for you to review.
Yes, this was the intended behavior. (A connection to `localhost` is being made, and this is tunneled over `{host}`, similar to `ssh -J {host} localhost`.)
I assumed people would be using `/opt/TurboVNC/bin/vncviewer {host}:/{uds_path_on_host} -extssh` in this case. But it might make sense to create a special case for `localhost`, so that connecting to `localhost` with `-via {host}` is treated the same as connecting to `{host}` directly.
The first thread in StreamDescriptor.java reads from `inputStream` (which would be the `InputStream` provided by `channelExec.getInputStream()`, i.e. the stdout of the remote command) and writes the data into `inBuffer`. This thread is needed because the only way (that I could find, anyway) to find out when data is available from the `InputStream` is to call `read()`, which will block and therefore cannot be called on the main thread.

The second thread does the same thing for stdin of the remote command (and is needed because `write()` might block if the remote command is consuming the provided data too slowly).

The thread in `Tunnel.createTunnelJSchUDS()` copies the stderr of the remote command to the stderr of turbovnc (which also has to be in a separate thread, because it blocks when calling `read()`).
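The reader-thread pattern described above can be sketched as follows. This is a minimal stand-in, not the actual StreamDescriptor code; the class name and the use of a `ByteArrayOutputStream` as the buffer are invented for illustration. The point is simply that the blocking `read()` lives on its own thread, appending to a buffer the main thread can inspect without blocking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamPump {
  // Start a thread that pumps everything from 'in' into 'buffer'.
  // read() blocks until data arrives or EOF, so it cannot run on the
  // main thread; the main thread polls the buffer instead.
  public static Thread pump(InputStream in, OutputStream buffer) {
    Thread t = new Thread(() -> {
      byte[] chunk = new byte[4096];
      try {
        int n;
        while ((n = in.read(chunk)) != -1) {  // blocks until data or EOF
          synchronized (buffer) {
            buffer.write(chunk, 0, n);
          }
        }
      } catch (IOException e) {
        // stream closed; consumer will observe EOF via the buffer
      }
    });
    t.setDaemon(true);
    t.start();
    return t;
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "hello from the remote command".getBytes();
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    Thread t = pump(new ByteArrayInputStream(data), buffer);
    t.join();
    System.out.println(buffer.toString());
  }
}
```

The second (stdin) thread described above is the mirror image: it pumps from a buffer into the remote command's `OutputStream`, because `write()` can also block.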
Ok, I'll take a look at the branch.
OK, I can confirm that `ssh -J` does the same thing, so that's fair. It still needs to work properly without `-extssh`, though. The issue with the JSch case seems to be that the inner SSH connection is made from the point of view of the client rather than from the point of view of the gateway ("jump host").
But blocking I/O isn't a problem when using TCP ports. Is that because the SSH connection is forked, using `ssh -f`?
Again, though, why is this only necessary with Unix domain sockets?
One problem I found with the current code is that, if the SSH connection is closed, turbovnc does not seem to notice (this was working on the 3.0 branch). I have to check why that's the case.
Ok, I'll have to look into that problem.
With TCP ports, the `SocketChannel` API (see https://docs.oracle.com/en/java/javase/19/docs/api/java.base/java/nio/channels/SocketChannel.html) can be used to check whether data is available (or can be sent) without blocking (which is what SocketDescriptor.java in turbovnc does). When using the streams provided by JSch (from `channelExec.getInputStream()`, etc.) to access stdin/stdout, this is not possible.
The forwarding over TCP uses `setPortForwardingL()` (which internally uses `ChannelDirectTCPIP` in PortWatcher.java), so there is no stderr channel.

The code in the `SessionManager` class (which, like the Unix domain socket code, uses `ChannelExec`) also has to deal with stderr (e.g. in `SessionManager.getSessions()`; see e.g. https://github.com/TurboVNC/turbovnc/blob/7cbab9560585907319d5bea3f123e331081a9600/java/com/turbovnc/vncviewer/SessionManager.java#L105). From what I can see, the way the code currently works is that it reads stderr after reading stdout. I think there is a risk of a deadlock here if this is not done in a separate thread: if the server sends a lot of data over stderr, flow control might kick in, and the server might wait until the client starts reading the data from stderr. But if the client is waiting until stdout has been closed before reading from stderr, there is a deadlock.
Would any of that become more straightforward if we took advantage of Unix domain socket support in Java 16? Because the current plan is to embed Java 17 with TurboVNC 3.1, so using the built-in Unix domain socket API in Java should be possible. I remember that we discussed that, but I forgot what we concluded.
I'll look into that.
For connecting to a remote host, this would not help, as the problem is a limitation in the JSch API (when using `-extssh=0`) or the `Process` API (when using `-extssh=1`). When connecting to a UDS on the local host, I think it would help, because then `UnixDomainSocketAddress` + `SocketChannel` can be used, and `SocketChannel` allows `select()` to be used.
The problem is that `StreamDescriptor.read()` returns -1 on EOF, but `FdInStream.readWithTimeoutOrCallback()` considers 0 to be EOF. `SocketDescriptor.read()`, I think, returns 0 on EOF: https://github.com/TurboVNC/turbovnc/blob/7cbab9560585907319d5bea3f123e331081a9600/java/com/turbovnc/network/SocketDescriptor.java#L86 (it turns the -1 that `SocketChannel.read()` returns on EOF into 0). So `StreamDescriptor.read()` should probably also return 0 on EOF.
Another problem I saw was here: https://github.com/TurboVNC/turbovnc/blob/7cbab9560585907319d5bea3f123e331081a9600/java/com/turbovnc/vncviewer/Tunnel.java#L375 — `connectUDSDirect()` should not wait for the `socat` process to finish.
In https://github.com/steffen-kiess/turbovnc/commit/d3e55fe8074c066322ebabf1eb2981be08c8f78a, I fixed these problems. This commit also makes commands like `vncviewer :/some/socket -via=some.host.example.com` (or, equivalently, `vncviewer localhost:/some/socket -via=some.host.example.com`) work. It will now open an SSH connection to `some.host.example.com` and will no longer attempt to open a second (inner) connection to `localhost`. This works both with `-extssh=1` and `-extssh=0`.
OK, thanks. I will look into all of that later this week. I think we should assume Java 16+ for local UDS connections, in order to simplify the code. I’ll look into that as well, but not until the current implementation is working properly and has been fully merged.
Realistically, this shouldn't occur, because only a small amount of data is sent over stdout and stderr. Specifically, stdout is used to send the information about active TurboVNC sessions, and stderr is used to send the console output of `/opt/TurboVNC/bin/vncserver`. That console output will not be more than a few lines. Unless I miss my guess, such a deadlock could not occur unless the user did something really stupid, such as installing some other script as /opt/TurboVNC/bin/vncserver.
That fixes the redundant double SSH connection, but it doesn't fix the issue whereby, when using JSch, the inner SSH connection is made from the point of view of the client rather than the jump host. Let's say that I'm connecting to a machine called `turbovnc_host` that is only visible from a machine called `gateway_host`. Attempting to do something like this:

`/opt/TurboVNC/bin/vncviewer -via gateway_host turbovnc_host:/{uds_path}`

results in an UnknownHostException. It works properly with `-extssh`.
If it proves too difficult to make JSch work properly with this feature, then I am OK with deferring that functionality to a future release. With TurboVNC 3.1, I think it would be OK to require `ExtSSH` in order to use Unix domain sockets remotely and Java 16+ in order to use Unix domain sockets locally. It appears that proper Unix domain socket support is at least on the JSch development radar.
Also, regarding the `%` variables in the UDS path, I think it would make more sense to parse `~` as the home directory instead of `%h`, and then to process any VNC display beginning with either `/` or `~` as a UDS path instead of a TCP port. I am OK with leaving `%u` and `%i` as undocumented variables.

Also, since the TurboVNC Server stores the UDS path in a known location for each VNC display, I think it would make sense to have special display designators in the viewer (`uds1`, `uds2`, etc.) that automatically map to the UDS path stored by the server for a particular VNC display. That way, a user could simply run `/opt/TurboVNC/bin/vncserver -uds` on the host, which would report the display as `{host}:uds{n}`, and they could then enter `{host}:uds{n}` into the viewer. The full UDS path would be reported in the server log, as the TCP port currently is. It probably also makes sense for the viewer to parse the full UDS path only with a double colon rather than a single colon, to mirror the TCP functionality. As is the case with TCP, a user could allow the server to automatically determine a UDS path by passing `-uds`, in which case they could enter a simplified display designator (`{host}:uds{n}`), which maps to that UDS path, into the viewer. They could also pass `-rfbunixpath` to the server, which would require them to enter the exact UDS path (`{host}::{uds_path}`) into the viewer, much as they would have to enter `{host}::{tcp_port}` into the viewer if they started the server with a non-default TCP port (`-rfbport`). I can make all of those changes myself but wanted to give you a chance to comment on them first.
The "dev.uds.wip" branch now contains the following changes:

- As discussed above, Unix domain socket paths are now specified with a double colon rather than a single colon.
- I chose not to create a special case for `localhost`. (It's no more confusing for `-via` to use `VNC_TUNNEL_CMD` when `localhost` is specified as the VNC host than it is for `-via` to make multiple SSH connections when `localhost` is specified as the VNC host, and it's usually better to let advanced features remain advanced rather than try to second-guess the user. Users who are familiar with SSH will readily understand why multiple SSH connections are made, because `VNC_VIA_CMD` contains `ssh -J`.)
- `ExtSSH` is now enabled when connecting to a remote Unix domain socket.
- `~` in Unix domain socket paths, rather than `%h`, now expands to the user's home directory on the VNC host, and a `~` or `/` character following a double colon is assumed to be the start of a Unix domain socket path.
- I have shied away from using the `:uds{n}` syntax described above, since I really need to understand how best to integrate Unix domain sockets with the TurboVNC Session Manager before I choose a more user-friendly syntax. Instead, I have repurposed the `%h` variable, which now expands to the remote host name, so users can specify `{host}::~/.vnc/%h_{display_number}.uds` in order to connect to the Unix domain socket that the TurboVNC Server creates by default when passed `-uds`.
- Note that I chose not to integrate the change to CConn.java since, upon further reflection, that change made the viewer's behavior more confusing.

Note also that I have observed a performance issue when using the Cygwin SSH client. This seems to be related to StreamDescriptor, because if I forward a local TCP port to the remote Unix domain socket, the performance is as expected. This can be reproduced with Cygwin SSH as follows:
This performs very slowly:

`vncviewer.bat {host}::{uds_path} -securitytypes vncauth`

This performs acceptably:

`VNC_TUNNEL_CMD='ssh.exe -axf -L %L:%R %H sleep 20' vncviewer.bat {host}::{uds_path} -securitytypes vncauth`
I don't observe the same performance issue with Mac or Linux clients, nor with the Microsoft-provided SSH client. Thus, it isn't a show-stopping issue, but I am mentioning it in case you might have any ideas regarding its cause.
Unless I hear back from you, the feature as implemented in "dev.uds.wip" will be pushed to dev later this week. (I still need to finish the documentation of it.)
I think there might be some cases where, e.g., the remote SSH server writes error messages to stderr, but I agree that in all realistic cases these will probably be too small to trigger a deadlock.
Note that the home directory was `%d`, not `%h` (the `%` codes match those documented in ssh_config(5)). I agree that `~` is more readable than `%d`. However, it might be a good idea to only allow `~` at the beginning of the path. (A `~` not at the beginning of the path would not work anyway, and this way it is possible to specify paths that actually contain a `~`, if needed. This also matches the behavior of Unix shells, which only expand `~` at the beginning.) This can be achieved by using

`if (i == 0 && udsPath.charAt(i) == '~') {`

instead of

`if (udsPath.charAt(i) == '~') {`

in java/com/turbovnc/vncviewer/Tunnel.java.
Also, currently a connection to `hostname::~dir/socket` will work but will attempt to connect to `/home/usernamedir/socket`, which probably is not what the user intended. Maybe the check in Hostname.java should be changed from `substring.startsWith("::/") || substring.startsWith("::~")` to `substring.startsWith("::/") || substring.startsWith("::~/")`.
Ok. I like having `%i`, as it allows setting the path to something like `/run/user/%i/some_socket` without knowing the numeric user ID on the remote host.
- Note that I chose not to integrate the change to CConn.java since, upon further reflection, that change made the viewer's behavior more confusing.
Ok.
As discussed above, Unix domain socket paths are now specified with a double colon rather than a single colon.
Ok. I agree that `%h` is useful.
Note also that I have observed a performance issue when using the Cygwin SSH client. [...] I don't observe the same performance issue with Mac or Linux clients, nor with the Microsoft-provided SSH client. Thus, it isn't a show-stopping issue, but I am mentioning it in case you might have any ideas regarding its cause.
I'm afraid I don't, and since Windows now has a proper ssh client (and WSL exists) I'm not really using Cygwin anymore anyway.
From my side, everything in the branch seems to be OK, except that (as mentioned above) I would restrict `~` expansion to the beginning of the string.
I will make the suggested change for parsing `~`. I didn't realize that the `%` variables were specified in the ssh_config man page. That makes things easier, because it means I can just refer users to that man page. I do notice, however, that `%i` and `%u` are supposed to expand to the "local" user ID and username. Should we use `%r` instead? I'm not sure what "local" and "remote" mean in the context of that man page.
I suppose "local" means the host where the `ssh` command is running, and "remote" is the host where `sshd` is running.
Unfortunately, ssh does not have a remote equivalent for `%i` (it cannot get this value without logging in to the remote host, and these values are expanded before connecting, at least in some contexts). Note that vncviewer doesn't match the ssh semantics exactly anyway: for example, `%h` in vncviewer is now the remote host name as seen by the remote host, while in ssh it refers to the remote host name as seen by the local host (which might differ, e.g. it might be fully qualified). So the semantics would not be exactly the same in any case.
Pushed to dev. Remaining things to do:
The idea behind Item 3 is that the Session Manager protocol, which is basically just an efficient way of communicating TurboVNC session information through stdout/SSH, could be extended so that it lists Unix domain socket paths for running sessions, and the Session Manager could seamlessly set up the SSH tunnel to a remote UDS rather than to a remote TCP port. However, since the Session Manager is designed to be self-contained, it really needs JSch. Technically speaking, the Session Manager doesn't need multi-level SSH tunnels, so it should be possible to support UDS connections with it now, but if we say that UDS connections work with JSch, then that needs to be true for all operational modes.
In both cases, StreamSocket is used. So if built-in UDS support in Java 16 wouldn't help the SSH case, then I don't see how it could reduce code complexity at all. Am I missing something?
No, I don't think so. StreamSocket will be needed for the SSH case; only the local case might avoid it, so this won't simplify the code.
However, using the built-in UDS support will avoid the dependency on `socat`. With JSch (if it gets Unix domain socket support), it also might be possible to avoid the remote dependency on socat. (But then implementing the `~` and `%` expansion would be more difficult, though it should still be possible by executing commands like `id` remotely before connecting to the Unix domain socket.)
This is https://github.com/TurboVNC/turbovnc/pull/294 rebased to main.