Xpra-org / xpra

Persistent remote applications for X11; screen sharing for X11, MacOS and MSWindows.
https://xpra.org/
GNU General Public License v2.0
1.94k stars 167 forks source link

Invalid response - ERR_INVALID_HTTP_RESPONSE #3889

Closed jhgoebbert closed 9 months ago

jhgoebbert commented 1 year ago

Describe the bug Sometimes Xpra does not deliver the requested HTML files (index.html) or delivers corrupted ones.

To Reproduce

Then I can see ...

System Information (please complete the following information):

Additional context image image image

jhgoebbert commented 1 year ago

The real case scenario is that our login nodes are crowded with people and xpra has slow access to IO or compute. Instead of waiting for the file requested by the browser it seems to me that xpra gives up and sends an index.html with size 0byte.
I will try to prove that with the logs.

jhgoebbert commented 1 year ago

These are the Xpra logs when the browser receives a 0byte index.html:

^[[32m2023-06-08 14:37:04,175 flush_then_close: wait_for_packet_sent() close_and_release()^[[0m
^[[32m2023-06-08 14:37:04,175 Protocol.close(None) closed=True, connection=None^[[0m
^[[32m2023-06-08 14:37:04,175 flush_then_close: done, callback=None^[[0m
^[[32m2023-06-08 14:37:04,194 untilConcludes(<bound method Connection.is_active of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900>, <bound method Connection.can_retry of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900>, <built-in method recv of socket object at 0x7f07e7fe0400>, (65536,), {}) timed out, retry=socket.timeout^[[0m
^[[32m2023-06-08 14:37:04,247 io_in_cb(<socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 12122)>, <flags G_IO_IN of type GLib.IOCondition>)^[[0m
^[[32m2023-06-08 14:37:04,247 read_parse_thread_loop starting^[[0m
^[[32m2023-06-08 14:37:04,248 _new_connection(<socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 12122)>, 'tcp', 0)^[[0m
^[[32m2023-06-08 14:37:04,248 invalid header: b'GET /index.html HTTP/1.1\r\nHost: localhost:12 .. ed-Since: Wed, 07 Jun 2023 18:55:42 GMT\r\n\r\n', input_packetcount=0, html=True, ssl=True^[[0m
^[[32m2023-06-08 14:37:04,248 invalid_header(Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900), 817 bytes: 'invalid packet header byte', b'GET /index.html HTTP/1.1\r\nHost: localhost:12 .. ed-Since: Wed, 07 Jun 2023 18:55:42 GMT\r\n\r\n')^[[0m
^[[32m2023-06-08 14:37:04,249 SocketConnection(<socket.socket fd=20, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50914)>, ('127.0.0.1', 12122), ('127.0.0.1', 50914), ('127.0.0.1', 50914), 'tcp', None, {})^[[0m
^[[32m2023-06-08 14:37:04,249 Connection(('127.0.0.1', 50914), 'tcp', None, {})^[[0m
^[[32m2023-06-08 14:37:04,249 tcp options: cork=True, nodelay=None^[[0m
^[[32m2023-06-08 14:37:04,249 accept_connection(<socket.socket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('0.0.0.0', 12122)>, tcp, 0.1)=tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914^[[0m
^[[32m2023-06-08 14:37:04,250 process default packet gibberish^[[0m
^[[32m2023-06-08 14:37:04,250 Received uninterpretable nonsense from Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900): invalid packet header byte: http^[[0m
^[[32m2023-06-08 14:37:04,250  data: b'GET /index.html HTTP/1.1\r\nHost: localhost:12 .. ed-Since: Wed, 07 Jun 2023 18:55:42 GMT\r\n\r\n'^[[0m
^[[32m2023-06-08 14:37:04,250 disconnect_client(Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900), invalid packet header byte: http, ())^[[0m
^[[32m2023-06-08 14:37:04,250 disconnect_protocol(Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900), ('invalid packet header byte: http',))^[[0m
^[[32m2023-06-08 14:37:04,250 do_get_socket_info(<socket.socket fd=19, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50900)>) fd=19, local=('127.0.0.1', 12122)^[[0m
^[[32m2023-06-08 14:37:04,251 getsockopt(SOL_TCP, TCP_INFO, 232)={'state': 1, 'ca_state': 0, 'retransmits': 0, 'probes': 0, 'backoff': 0, 'options': 7, 'snd_wscale': 9, 'rcv_wscale': 9, 'rto': 212000, 'ato': 40000, 'snd_mss': 22016, 'rcv_mss': 817, 'unacked': 0, 'sacked': 0, 'lost': 0, 'retrans': 0, 'fackets': 0, 'last_data_sent': 5135, 'last_ack_sent': 0, 'last_data_recv': 5, 'last_ack_recv': 5, 'pmtu': 65535, 'rcv_ssthresh': 45324, 'rtt': 62, 'rttvar': 31, 'snd_ssthresh': 2147483647, 'snd_cwnd': 10, 'advmss': 65483, 'reordering': 3, 'rcv_rtt': 5130000, 'rcv_space': 43690, 'total_retrans': 0, 'pacing_rate': 18446744073709551615, 'max_pacing_rate': 18446744073709551615, 'bytes_acked': 0, 'bytes_received': 817, 'segs_out': 1, 'segs_in': 3, 'notsent_bytes': 0, 'min_rtt': 62, 'data_segs_in': 1, 'data_segs_out': 0, 'delivery_rate': 0, 'busy_time': 0, 'rwnd_limited': 0, 'sndbuf_limited': 0, 'delivered': 1, 'delivered_ce': 0, 'bytes_sent': 0, 'bytes_retrans': 0, 'dsack_dups': 0, 'reord_seen': 0, 'rcv_ooopack': 0, 'snd_wnd': 44032}^[[0m
^[[32m2023-06-08 14:37:04,251 handle_new_connection(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, ('0.0.0.0', 12122), {}) sockname=('127.0.0.1', 12122), target=('127.0.0.1', 50914)^[[0m
^[[32m2023-06-08 14:37:04,252 get_tcp_info(<socket.socket fd=19, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50900)>)={'state': 1, 'ca_state': 0, 'retransmits': 0, 'probes': 0, 'backoff': 0, 'options': 7, 'snd_wscale': 9, 'rcv_wscale': 9, 'rto': 212000, 'ato': 40000, 'snd_mss': 22016, 'rcv_mss': 817, 'unacked': 0, 'sacked': 0, 'lost': 0, 'retrans': 0, 'fackets': 0, 'last_data_sent': 5135, 'last_ack_sent': 0, 'last_data_recv': 5, 'last_ack_recv': 5, 'pmtu': 65535, 'rcv_ssthresh': 45324, 'rtt': 62, 'rttvar': 31, 'snd_ssthresh': 2147483647, 'snd_cwnd': 10, 'advmss': 65483, 'reordering': 3, 'rcv_rtt': 5130000, 'rcv_space': 43690, 'total_retrans': 0, 'pacing_rate': 18446744073709551615, 'max_pacing_rate': 18446744073709551615, 'bytes_acked': 0, 'bytes_received': 817, 'segs_out': 1, 'segs_in': 3, 'notsent_bytes': 0, 'min_rtt': 62, 'data_segs_in': 1, 'data_segs_out': 0, 'delivery_rate': 0, 'busy_time': 0, 'rwnd_limited': 0, 'sndbuf_limited': 0, 'delivered': 1, 'delivered_ce': 0, 'bytes_sent': 0, 'bytes_retrans': 0, 'dsack_dups': 0, 'reord_seen': 0, 'rcv_ooopack': 0, 'snd_wnd': 44032, 'sndbuf_size': 2626560, 'sndbuf_unacked': 0, 'sndbuf_bytes': 0}^[[0m
^[[32m2023-06-08 14:37:04,252 peek_connection(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, 1000, 8192)^[[0m
2023-06-08 14:37:04,252 Disconnecting client 127.0.0.1:50900:^[[0m
^[[32m2023-06-08 14:37:04,252 set_socket_timeout(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, 1.0) applied to <socket.socket fd=20, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50914)>^[[0m
2023-06-08 14:37:04,252  invalid packet header byte: http^[[0m
^[[32m2023-06-08 14:37:04,252 <built-in method recv of socket object at 0x7f07e7ff09a0>(8192, MSG_PEEK)^[[0m
^[[32m2023-06-08 14:37:04,253 flush_then_close(['disconnect', 'invalid packet header byte: http'], None) closed=False^[[0m
^[[32m2023-06-08 14:37:04,253 flush_then_close: wait_for_write_lock()^[[0m
^[[32m2023-06-08 14:37:04,253 flush_then_close: acquired the write lock^[[0m
^[[32m2023-06-08 14:37:04,253 flush_then_close: queue is now empty, sending the last packet and closing^[[0m
^[[32m2023-06-08 14:37:04,253 packet type send alias not found for 'disconnect'^[[0m
^[[32m2023-06-08 14:37:04,253 sending 45 bytes without header^[[0m
^[[32m2023-06-08 14:37:04,253 raw_write for disconnect, starting write thread^[[0m
^[[32m2023-06-08 14:37:04,253 io_thread_loop(write, <bound method Protocol._write of Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900)>) loop starting^[[0m
^[[32m2023-06-08 14:37:04,253 cleanup_protocol(Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900))^[[0m
^[[32m2023-06-08 14:37:04,254 changed tcp socket to nodelay=True^[[0m
^[[32m2023-06-08 14:37:04,254 flush_then_close: packet_queued() closed=False^[[0m
^[[32m2023-06-08 14:37:04,254 flush_then_close: wait_for_packet_sent() queue.empty()=True, closed=False^[[0m
^[[32m2023-06-08 14:37:04,254 flush_then_close: wait_for_packet_sent() close_and_release()^[[0m
^[[32m2023-06-08 14:37:04,254 Protocol.close(None) closed=False, connection=tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900^[[0m
^[[32m2023-06-08 14:37:04,254 Protocol.close(None) calling <bound method SocketConnection.close of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900>^[[0m
^[[32m2023-06-08 14:37:04,254 process default packet connection-lost^[[0m
^[[32m2023-06-08 14:37:04,254 tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900.close() socket=<socket.socket fd=19, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50900)>^[[0m
^[[32m2023-06-08 14:37:04,254 process_connection_lost(Protocol(None), ['connection-lost'])^[[0m
^[[32m2023-06-08 14:37:04,255 tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50900.close() done^[[0m
^[[32m2023-06-08 14:37:04,255 cleanup_protocol(Protocol(None))^[[0m
^[[32m2023-06-08 14:37:04,255 terminate_queue_threads()^[[0m
^[[32m2023-06-08 14:37:04,255 Protocol.close(None) done^[[0m
^[[32m2023-06-08 14:37:04,255 flush_then_close: done, callback=None^[[0m
^[[32m2023-06-08 14:37:04,255 io_thread_loop(write, <bound method Protocol._write of Protocol(None)>) loop ended, closed=True^[[0m
^[[32m2023-06-08 14:37:04,347 can_retry: <class 'OSError'>, args=(9, 'Bad file descriptor'), errno=9, code=9, abort=9^[[0m
^[[32m2023-06-08 14:37:04,347 None closed in read loop
Traceback (most recent call last):
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/protocol.py", line 690, in _io_thread_loop
    while not self._closed and callback():
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/protocol.py", line 779, in _read
    buf = self.con_read()
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/protocol.py", line 797, in con_read
    return self._conn.read(self.read_buffer_size)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 345, in read
    return self._read(self._socket.recv, n)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 162, in _read
    r = self.untilConcludes(*args)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 147, in untilConcludes
    return untilConcludes(self.is_active, self.can_retry, *args)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 82, in untilConcludes
    retry = can_retry_cb(e)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 144, in can_retry
    return can_retry(e)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 72, in can_retry
    raise ConnectionClosedException(e) from None
xpra.net.common.ConnectionClosedException: [Errno 9] Bad file descriptor^[[0m
^[[32m2023-06-08 14:37:05,251 connection lost: invalid packet header byte: http^[[0m
^[[32m2023-06-08 14:37:05,251 Protocol.close(invalid packet header byte: http) closed=True, connection=None^[[0m
^[[32m2023-06-08 14:37:05,254 peek_connection(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, 1000) failed
Traceback (most recent call last):
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/socket_util.py", line 196, in peek_connection
    peek_data = conn.peek(size)
  File "/p/usersoftware/swmanage/goebbert1/stage2023/JupyterLab/easybuild/juwels/software/xpra/4.4.5-GCCcore-11.3.0/lib/python3.10/site-packages/xpra-4.4.5-py3.10-linux-x86_64.egg/xpra/net/bytestreams.py", line 342, in peek
    return self._socket.recv(n, socket.MSG_PEEK)
TimeoutError: timed out^[[0m
^[[32m2023-06-08 14:37:05,505 peek: elapsed=1252, timeout=1000^[[0m
^[[32m2023-06-08 14:37:05,505 socket tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914 peek: got 0 bytes^[[0m
^[[32m2023-06-08 14:37:05,505 socket peek=b''^[[0m
^[[32m2023-06-08 14:37:05,506 socket peek hex=^[[0m
^[[32m2023-06-08 14:37:05,506 socket peek line1=b''^[[0m
^[[32m2023-06-08 14:37:05,506 guess_packet_type(..)=None^[[0m
^[[32m2023-06-08 14:37:05,506 log_new_connection(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, ('0.0.0.0', 12122)) type=<class 'xpra.net.bytestreams.SocketConnection'>, sock=<socket.socket fd=20, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 12122), raddr=('127.0.0.1', 50914)>, sockname=('127.0.0.1', 12122), address=('127.0.0.1', 50914), peername=('127.0.0.1', 50914)^[[0m
2023-06-08 14:37:05,506 New tcp connection received^[[0m
2023-06-08 14:37:05,506  from '127.0.0.1:50914'^[[0m
2023-06-08 14:37:05,506  on '0.0.0.0:12122'^[[0m
^[[32m2023-06-08 14:37:05,506 make_protocol('tcp', tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914, {}, <function ServerCore.make_protocol.<locals>.xpra_protocol_class at 0x7f07e7f8f7f0>, None)^[[0m
^[[32m2023-06-08 14:37:05,507 tcp: encryption=, keyfile=^[[0m
^[[32m2023-06-08 14:37:05,507 socktype=tcp, encryption=, keyfile=^[[0m
^[[32m2023-06-08 14:37:05,507 io_thread_loop(read, <bound method Protocol._read of Protocol(tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914)>) loop starting^[[0m
^[[32m2023-06-08 14:37:05,608 untilConcludes(<bound method Connection.is_active of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <bound method Connection.can_retry of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <built-in method recv of socket object at 0x7f07e7ff09a0>, (65536,), {}) timed out, retry=socket.timeout^[[0m
^[[32m2023-06-08 14:37:05,708 untilConcludes(<bound method Connection.is_active of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <bound method Connection.can_retry of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <built-in method recv of socket object at 0x7f07e7ff09a0>, (65536,), {}) timed out, retry=socket.timeout^[[0m
^[[32m2023-06-08 14:37:05,809 untilConcludes(<bound method Connection.is_active of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <bound method Connection.can_retry of tcp socket: 127.0.0.1:12122 <- 127.0.0.1:50914>, <built-in method recv of socket object at 0x7f07e7ff09a0>, (65536,), {}) timed out, retry=socket.timeout^[[0m
totaam commented 1 year ago

and ssh-tunneling the port to my local laptop.

That's the problem, ssh adds buffering. Don't do that, or use bind-ws / bind-wss instead of bind-tcp and bind-ssl. Another solution is to not use xpra as a web server and host the html5 client on a real web server.

jhgoebbert commented 1 year ago

I understand your point 🤔
... and so I tried with bind-wss (through an nginx proxy). It was much less likely that xpra provided 0-sized js/html files to the browser, but it still happened: image

totaam commented 1 year ago

If you have an nginx proxy, then let nginx serve the html files.

jhgoebbert commented 1 year ago

Sure, that would be a good work-around - but unfortunately not a good one for our setup :thinking:
We are serving different web-applications running on our HPC systems though a JupyterHub/Jupyter-Server-Proxy combination (hosted on our cloud) for authentication/authorization/HPCforwarding. Xpra runs on the HPC system (compute or login) but the nginx on the cloud.

I have not seen these issues yet with other webapps (like JupyterLab - based on tornado). So I was wondering what is done different there.

totaam commented 1 year ago

This can definitely be fixed, but I just don't have the time for it. Sorry.

jhgoebbert commented 1 year ago

Thank you for your answer. I understand the problem now much better and I can image that a fix can be complex/time-consuming.

And of course: Thank you also for the great Xpra! It replaced VNC at our place almost completely.

totaam commented 1 year ago

@jhgoebbert Thanks. As a corporate user, have you considered: https://github.com/Xpra-org/xpra/blob/master/docs/SPONSORS.md

jhgoebbert commented 1 year ago

Good you point that out. I will pass this on.

totaam commented 9 months ago

I can reproduce the problem without shortening the server timeout using:

from time import sleep
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("127.0.0.1", 10000))
    sleep(6)
    s.sendall(b"GET /echo HTTP/1.1\n\n")
    data = s.recv(1024)

print(f"Received {data!r}")

Which prints: Received b'disconnect: invalid packet header byte: http\n'


On the server side:

accept_connection(<socket.socket fd=6, family=2, type=1, proto=0, laddr=('0.0.0.0', 10000)>, tcp, 0.1)=tcp socket: 127.0.0.1:10000 <- 127.0.0.1:49526
(...)
socket peek line1=b''
(...)
New tcp connection received
(...)
untilConcludes(<bound method Connection.is_active of tcp socket: 127.0.0.1:10000 <- 127.0.0.1:49526>, <bound method Connection.can_retry of tcp socket: 127.0.0.1:10000 <- 127.0.0.1:49526>, <built-in method recv of socket object at 0x7f18a85d86e0>, (65536,), {}) timed out, retry=socket.timeout
(...)
read_parse_thread_loop starting
invalid header: b'GET /echo HTTP/1.1\n\n', input_packetcount=0, websocket_upgrade=True, ssl=True
invalid_header(Protocol(tcp socket: 127.0.0.1:10000 <- 127.0.0.1:49526), 20 bytes: 'invalid packet header byte', b'GET /echo HTTP/1.1\n\n')
process default packet gibberish
totaam commented 9 months ago

I'm not sure if this is the correct approach yet, for sure the code needs tidying up, but the commit above does solve the immediate problem and looks back-portable to v5.

TODO:

totaam commented 9 months ago

@jhgoebbert please try the latest beta builds, 6.0-r34912 and later have this fix.