FDH2 / UxPlay

AirPlay Unix mirroring server
GNU General Public License v3.0
1.35k stars 72 forks source link

After ~20 minutes my screen mirroring freeze forever #58

Closed k0ste closed 2 years ago

k0ste commented 2 years ago

Hi, just started to use uxplay today, 1.46, then compile from git

The command is:

uxplay -s 3840x2160 -vs "waylandsink fullscreen=true" -fps 60 -as 0 -d

The problem is: after ~20 minutes screen mirror (Wayland window on Linux) freeze and on debug I see this:

POST /feedback RTSP/1.0
CSeq: 1066
DACP-ID: 92E04835971C9CF6
Active-Remote: 2323908967
User-Agent: AirPlay/600.8.1

Handling request POST with URL /feedback
raop_handler_feedback

RTSP/1.0 200 OK
CSeq: 1066
Server: AirTunes/220.68

raop_ntp send_len = 32
raop_ntp receive time type_t packetlen = 32
raop_ntp sync correction = 0
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive timeout

Mac machine in this case don't drop connection After some time RTP seems work again, but screen freezes for forever

Reconnect (e.g. disable mirroring on Mac and reenable) - impossible, MacOS just can't connect again, restart of uxplay resolve's this problem

Any ideas what is wrong? Power savings on Linux machine was disabled (for Wi-Fi module and for PC in Gnome settings)

fduncanh commented 2 years ago

No idea. POST/feedback is the client checking if server is "alive", but it replied OK which version of uxplay, its it the latest from github? 1.46?

  1. dont use -fps 60
  2. try an older version from the releases to see if this is something new.
k0ste commented 2 years ago

which version of uxplay, its it the latest from github?

Yep, this log from latest from git I will try to use again tomorrow, with your recommendations👨‍✈️

fduncanh commented 2 years ago

The ios/macos client is not responding to the server. May be a problem on the client?

this is happening near line 260 in lib/raop_ntp.c

This is maintaining the relative timing of packets sent and received. Good luck in troubleshooting.

k0ste commented 2 years ago

I was try without -fps 60, seems without this the render window works longer, but still freezes uxplay-git.log

Also tried version 1.45 - also get freeze uxplay.1.45.log

Should I try to downgrade deeper?

fduncanh commented 2 years ago

It seems uxplay replies to the client's POST/feedback request (made periodically to check the server is still alive) but the client stops hearing it.

Please describe the client (is it ios or macos, what hardware ) and the server host (which linux (or other os) , what hardware)

  1. to rule out a problem with the client, can you test with a different client.
  2. to rule out some error I have introduced, please test with https://github.com/FDH2/UxPlay/tree/00865c9023ae898a8fd93fe72bce2fab34072965 (this is from August 7, 2021 before I did very many changes) Click on "Code" to get a zip download.
fduncanh commented 2 years ago

I just streamed a youtube movie from a macbook pro running Catalina macOS 10.15.7 to uxplay on a linux desktop machine. Its been running now for 90 mins with no issues.

Your debug output ends in a TEARDOWN (stop accepting streaming) request from the client. Is this due to you manually selecting Stop Mirroring on the client, or did it produce the request by itself?

I also see no audio was sent. Are you just mirroring you mac screen?

k0ste commented 2 years ago

Please describe the client (is it ios or macos, what hardware ) and the server host (which linux (or other os) , what hardware)

MacBook Pro (13-inch, 2019, Two Thunderbolt 3 ports): macOS Monterey 12.1 (21C52)

CPU: 1,4 GHz Quad-Core Intel Core i5
RAM: 16 GB 2133 MHz LPDDR3
Kernel: Darwin 192.168.1.4 21.2.0 Darwin Kernel Version 21.2.0: Sun Nov 28 20:28:54 PST 2021; root:xnu-8019.61.5~1/RELEASE_X86_64 x86_64

Dell XPS 13 (9380):

CPU: Intel(R) Core(TM) i7-8565U
RAM: 16 GB 2133 MHz LPDDR3
Kernel: Linux WorkStation 5.16.2-arch1-1 #1 SMP PREEMPT Thu, 20 Jan 2022 16:18:29 +0000 x86_64 GNU/Linux
k0ste commented 2 years ago

I just streamed a youtube movie from a macbook pro running Catalina macOS 10.15.7 to uxplay on a linux desktop machine. Its been running now for 90 mins with no issues.

I stream only single window of Telegram messenger (App Store version). This just a "second screen for messenger" I will try to stream youtube, e.g. more dynamic, more frames

Your debug output ends in a TEARDOWN (stop accepting streaming) request from the client. Is this due to you manually selecting Stop Mirroring on the client, or did it produce the request by itself?

As I was say initially, MacOS thinks the connection is live, the window works on second screen, the pointer can be moved to second screen, the uxplay server thread seems works also, but window don't render new frames

I also see no audio was sent. Are you just mirroring you mac screen?

Yes, just a single maximazied window

k0ste commented 2 years ago

00865c9023ae898a8fd93fe72bce2fab34072965 - build from this commit seems very latent. When I make a move over touchpad I will see reaction only after ~ 1 second. Is this behavior expected?

fduncanh commented 2 years ago

00865c9 is old, I don't recommend it. (but no, your observed latency is unexpected) It was just to see if your problem was something introduced after it. I see no latency (with latest uxplay) on video from moving pointer on macbook and on the image on the mirror scree (I am testing right now, with no audo or video on the macbook, just a working screen) . Its been good for 25 mins now.

Should I test using uxplay as a "second display" for the macbook (is that what your use case is?)

k0ste commented 2 years ago

00865c9023ae898a8fd93fe72bce2fab34072965 is old, I dont recomment it. It was just to see if your problem was something introduced after it.

Ok, I just tested it with only -s option, seems is too old to support -vs "waylandsink fullscreen=true" option Just for the record, what I mean about latency: was try to record video with build from this commit - youtube

And this build also buggy for me, the log uxplay_00865c9023ae898a8fd93fe72bce2fab34072965.log

Should I test using uxplay as a "second display" for the macbook (is that what your use case is?)

This is exactly the case

fduncanh commented 2 years ago

Use uxplay as second display seems to be working well for me, pointer moves between the two screens (on mac and on linux) with no problem. Over 20 mins now

fduncanh commented 2 years ago

Yes , no -vs option on the old version. You didnt answer about the end of the debug straces. Did you manually terminate from the macos client , or did the client do it automatically. You 00865c debug trace ends with a TEARDOWN request from the client. This is a request to uxplay to stop mirroringt,

raop_rtp_mirror video ntp = 1643183103301150, now = 1643183103243205, latency = -57945
raop_rtp_mirror video ntp = 1643183103317817, now = 1643183103259684, latency = -58133
raop_rtp_mirror video ntp = 1643183103334483, now = 1643183103278394, latency = -56089
raop_rtp_mirror video ntp = 1643183103351150, now = 1643183103291105, latency = -60045
raop_rtp_mirror video ntp = 1643183103367817, now = 1643183103310690, latency = -57127
raop_rtp_mirror video ntp = 1643183103384483, now = 1643183103325129, latency = -59354   <--still working
raop_ntp send_len = 32
raop_ntp receive timeout               1
raop_ntp send_len = 32
raop_ntp receive timeout               2
raop_ntp send_len = 32
raop_ntp receive timeout               3
raop_ntp send_len = 32
raop_ntp receive timeout                4
raop_ntp send_len = 32
raop_ntp receive timeout               5
raop_ntp send_len = 32
raop_ntp receive timeout             6
raop_ntp send_len = 32
raop_ntp receive timeout             7
raop_ntp send_len = 32
raop_ntp receive timeout             8
raop_ntp send_len = 32
raop_ntp receive timeout             9
raop_ntp send_len = 32
raop_ntp receive timeout           10                <--- Hmm.   does uxplay  give up after 10 timeouts when it sends
raop_ntp send_len = 32                                                    something to the client without geeting a response?
ra
Exiting HTTP thread                                       <------------? giving up (look for this in code) line 368 in lib/httpd.c
op_ntp receive timeout
raop_ntp send_len = 32
raop_ntp receive time type_t packetlen = 32
raop_ntp sync correction = 0
httpd receiving on socket 25
conn_request
Handling request POST with URL /feedback
raop_handler_feedback
httpd receiving on socket 25
conn_request
Handling request TEARDOWN with URL rtsp://192.168.1.11/2318701260466505035    <<<<TEARDOWN  request
raop_rtp_mirror exiting TCP thread
httpd receiving on socket 25
conn_request
Handling request TEARDOWN with URL rtsp://192.168.1.11/2318701260466505035
httpd receiving on socket 25
Connection closed for socket 25
Destroying connection
raop_ntp stopping time thread
raop_ntp exiting thread
raop_ntp stopped time thread
httpd receiving on socket 27
Connection closed for socket 27
Destroying connection
Stopping...
fduncanh commented 2 years ago

Exiting HTTP thread is in lib/httpd.c which handles client-server communication. Maybe you can take a look and see if you can work out whats going on.

I see no issue in using uxplay as second display for my mac after over 30 mins.

k0ste commented 2 years ago

Did you manually terminate from the macos client , or did the client do it automatically.

Yes, I do it by hand after window freeze

I see no issue in using uxplay as second display for my mac after over 30 mins.

I perform another test - run youtube campfire, this test also works until window freeze, the log uxplay.log

fduncanh commented 2 years ago

try doing a debug where you terminate with ctrl-c on linux as soon as the free occurs. (keep moving the pointer till you see it freeze)

At least this will show what is happening when the freeze occurs.

k0ste commented 2 years ago

Maybe we need some datetime on logger? May be this is some timeouts just we don't see

try doing a debug where you terminate with ctrl-c on linux as soon as the free occurs. (keep moving the pointer till you see it freeze)

At least this will show what is happening when the freeze occurs.

Currently I just forward all stdout to file. I will try to perform test like this later...

k0ste commented 2 years ago

try doing a debug where you terminate with ctrl-c on linux as soon as the free occurs.

Done. Watched youtube, after some time picture is freeze, but sound continue playing without issues, the log before Ctrl+C uxplay.log

fduncanh commented 2 years ago

If the video freezes, but the audio continues, the communications between the macos client and uxplay are continuing fine.

"video freezes" just means that no new video frames are being sent to redraw the screen, so it will keep showing the last received video frame.

timestamps are shown in the "now = " entry in debug (I think this is microseconds since unix epoch = 1970-01-01.)

So video frames just stop arriving, with nothing showing in the debug.

raop just decrypts video and audio and sends frames to gstreamer. It doesn't know whether or not gstreamer actually renders them, there is no backwards communication from gstreamer, it's just a stream.

The record shows that suddenly no more video is listed as received from the client. The video arrives in the mirror_data_lport which is the first tcp port. Set it with option -p ( e.g. -p 50000 it will be tcp port 50000). You should use something like netstat to listen to this port and see if it stops receiving data when the screen freezes.

If the port still receives data, then one can add extra debug statements to see where it gets lost.

If if stops receiving data, its a problem with the client I suppose, unless uxplay needs to be sending something to the client. and stopped. I don't know the code, so would have to look at it.

fduncanh commented 2 years ago

the video is processed in lib/raop_rtp_mirror.c
function is raop_rtp_mirror_thread

fduncanh commented 2 years ago

summary so far;

  1. I can't reproduce the issue: a 90+ min youtube video streamed from a macbook to uxplay on my linux desktop works fine with no issues.
  2. when doing the same on your system, raop_mirror_thread stops receiving video packets for no apparent reason after about 20 mins. This means that the screen will continue to display the last video frame received (i.e. is frozen). Video frames should be received from the socket attached to the first tcp port, which can be set with the -p option.
  3. audio continues to arrive, get decrypted, and passed on to gstreamer for successful rendering.
  4. there is some messages sent by raop_ntp_thread to the client (on its port "timing_rport" mentioned early in the debug) that seem not to be answered (after video stops)

      uint64_t send_time = raop_ntp_get_local_time(raop_ntp);
        byteutils_put_ntp_timestamp(request, 24, send_time);
        int send_len = sendto(raop_ntp->tsock, (char *)request, sizeof(request), 0,
                              (struct sockaddr *) &raop_ntp->remote_saddr, raop_ntp->remote_saddr_len);
        logger_log(raop_ntp->logger, LOGGER_DEBUG, "raop_ntp send_len = %d", send_len);
        if (send_len < 0) {
            logger_log(raop_ntp->logger, LOGGER_ERR, "raop_ntp error sending request");
        } else {
            // Read response
            response_len = recvfrom(raop_ntp->tsock, (char *)response, sizeof(response), 0,
                                    (struct sockaddr *) &raop_ntp->remote_saddr, &raop_ntp->remote_saddr_len);
            if (response_len < 0) {
                logger_log(raop_ntp->logger, LOGGER_ERR, "raop_ntp receive timeout");
            } else {
                logger_log(raop_ntp->logger, LOGGER_DEBUG, "raop_ntp receive time type_t packetlen = %d", response_len);
    
                int64_t t3 = (int64_t) raop_ntp_get_local_time(raop_ntp);

the outgoing messages back to client can be listened to on timing_rport (set by client, probably udp, but I dont know for sure)

Client identified as User-Agent: AirPlay/600.8.1
16 byte aeskey after sha-256 hash with ecdh_secret:
d9 2c 3a 42 de ea fd f3 e9 58 ad 94 ff 81 da e0 

timing_rport = 50381
raop_ntp parse remote ip = 192.168.1.4
raop_ntp starting time
raop_ntp local timing port socket 32 port UDP 34370
raop_rtp parse remote ip = 192.168.1.4
raop_rtp_mirror parse remote ip = 192.168.1.4
eport = 37145, tport = 34370
fduncanh commented 2 years ago

but looking at your video-only earlier debug the raop_ntp timing dialog between server and client is continuing normally after the video stream abruptly stops. This does look like a client problem not a uxplay problem (?) its not a network problem, because audio keeps running after video stops. Definitely you need to use netstat to see whether the client is still sending video to uxplay on the server's mirror_data_lport (port tcp n with the -p n option or tcp 7011 with just -p)

raop_rtp_mirror video ntp = 1643093902966894, now = 1643093902986372, latency = 19478
raop_rtp_mirror video ntp = 1643093903000227, now = 1643093902986427, latency = -13800
raop_rtp_mirror video ntp = 1643093903033561, now = 1643093902988139, latency = -45422
raop_rtp_mirror video ntp = 1643093903066894, now = 1643093903022611, latency = -44283
raop_rtp_mirror video ntp = 1643093903100227, now = 1643093903056207, latency = -44020
raop_rtp_mirror video ntp = 1643093903133561, now = 1643093903088913, latency = -44648
raop_rtp_mirror video ntp = 1643093903166894, now = 1643093903121580, latency = -45314
httpd receiving on socket 29
conn_request

POST /feedback RTSP/1.0    <--------------------client asks server if it is still alive
CSeq: 3570
DACP-ID: 80925C4CA6A9B3F6
Active-Remote: 882162070
User-Agent: AirPlay/600.8.1

Handling request POST with URL /feedback
raop_handler_feedback

RTSP/1.0 200 OK    <-----------------------client tells server  yes, I'm alive
CSeq: 3570 
Server: AirTunes/220.68 

raop_ntp send_len = 32              <-----------------client asks server for its time
raop_ntp receive time type_t packetlen = 32    <----------server did receive an answer from client
raop_ntp sync correction = 0
httpd receiving on socket 29
conn_request

POST /feedback RTSP/1.0                <---------client asks server (again!) if it is alive (Why?, didnt it get the message yes?)
CSeq: 3571
DACP-ID: 80925C4CA6A9B3F6
Active-Remote: 882162070
User-Agent: AirPlay/600.8.1

Handling request POST with URL /feedback
raop_handler_feedback

perhaps the client stops sending video because it didnt hear the "yes I'm alive" reply. but why does audio continue? (Maybe because this is a protocol originally based on adding video to itunes, and the audio and video parts of the protocol are not consistent for historical reasons (?))

In any case, since I dont have your problem, its connected to the difference between your and my systems. I'm using an older macbook pro that only runs catalina. I can't test using Monterrey at the moment.

k0ste commented 2 years ago

As you suggested I tested mirror from another device - iPhone 13 Pro (iOS 15.2.1) Youtube plays ~40 minutes without problems, until

httpd receiving on socket 29
conn_request

POST /feedback RTSP/1.0
CSeq: 1207
DACP-ID: 1BEEB5F557FD3142
Active-Remote: 2777318997
User-Agent: AirPlay/600.8.1

Handling request POST with URL /feedback
raop_handler_feedback

RTSP/1.0 200 OK
CSeq: 1207
Server: AirTunes/220.68

raop_ntp send_len = 32
raop_ntp receive timeout

I was check Linux network for errors

2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DORMANT group default qlen 1000
    link/ether 9c:b6:d0:95:a7:1d brd ff:ff:ff:ff:ff:ff
    RX:   bytes  packets errors dropped  missed   mcast
    13502586089 10138852      0       0       0       0
    TX:   bytes  packets errors dropped carrier collsns
      463906548  1381759      0       0       0       0

And network adapter statistics

NIC statistics:
     rx_packets: 5981444
     rx_bytes: 8195370822
     rx_duplicates: 0
     rx_fragments: 5955089
     rx_dropped: 48
     tx_packets: 513915
     tx_bytes: 44320716
     tx_filtered: 0
     tx_retry_failed: 0
     tx_retries: 0
     sta_state: 4
     txrate: 300000000
     rxrate: 243000000
     signal: 206
     channel: 5180
     noise: 159
     ch_time: 145
     ch_time_busy: 17
     ch_time_ext_busy: 18446744073709551615
     ch_time_rx: 18446744073709551615
     ch_time_tx: 18446744073709551615
     tx_pkts_nic: 209659
     tx_bytes_nic: 0
     rx_pkts_nic: 6186779
     rx_bytes_nic: 0
     d_noise_floor: 18446744073709551517
     d_cycle_count: 2481799346
     d_phy_error: 7
     d_rts_bad: 0
     d_rts_good: 0
     d_tx_power: 25
     d_rx_crc_err: 0
     d_rx_crc_err_drop: 0
     d_no_beacon: 0
     d_tx_mpdus_queued: 513941
     d_tx_msdu_queued: 513941
     d_tx_msdu_dropped: 0
     d_local_enqued: 28
     d_local_freed: 28
     d_tx_ppdu_hw_queued: 209659
     d_tx_ppdu_reaped: 209659
     d_tx_fifo_underrun: 0
     d_tx_ppdu_abort: 0
     d_tx_mpdu_requeued: 50786
     d_tx_excessive_retries: 15586
     d_tx_hw_rate: 3
     d_tx_dropped_sw_retries: 0
     d_tx_illegal_rate: 0
     d_tx_continuous_xretries: 0
     d_tx_timeout: 0
     d_tx_mpdu_txop_limit: 0
     d_pdev_resets: 2
     d_rx_mid_ppdu_route_change: 817
     d_rx_status: 6218083
     d_rx_extra_frags_ring0: 0
     d_rx_extra_frags_ring1: 35
     d_rx_extra_frags_ring2: 109
     d_rx_extra_frags_ring3: 0
     d_rx_msdu_htt: 6186860
     d_rx_mpdu_htt: 6186779
     d_rx_msdu_stack: 26432
     d_rx_mpdu_stack: 26432
     d_rx_phy_err: 0
     d_rx_phy_err_drops: 4
     d_rx_mpdu_errors: 197417
     d_fw_crash_count: 0
     d_fw_warm_reset_count: 12
     d_fw_cold_reset_count: 7

Not much drops for Wi-Fi I think

What's next, record tcpdumps from Mac and Linux simultaneously?

This does look like a client problem not a uxplay problem

I think so too

Definitely you need to use netstat to see whether the client is still sending video to uxplay on the server's

I will test this tomorrow, when I was streamed from iPhone the RX on Linux interface was around 30Mbit/s

fduncanh commented 2 years ago

SUMMARY so far

The symptom is that suddenly after streaming for about 20mins no more video packets seem to arrive at the server's first tcp port n set using -p n. Audio packets continue to arrive at a different (udp) port and are rendered.

Check to see if in fact the data stream at that tcp port was flowing in before, and has now stopped.

Further, I can only test on macOS 10.15.7, I dont know it macOS 11.x has changed anything about the AirPlay client, (It includes a native AirPlay server for example). But if you are seeing the problem on an iOS client, that's different. I cannot reproduce the problem, all works fine for me, youtube videos of more that 90 mins length show perfectly.

fduncanh commented 2 years ago

looking at your video-only debug data uxplay-git.log (which is easier to look at without the audio reports) I see many intermittent occasions where the "heartbeat" POST/feedback request dialog is repeated a number of times before the video stream restarts. (for example, as early as at line 778). This seems to be because the client fails to hear the server's OK response and repeats the POST/feedback until it eventually hears it and restarts streaming video.

Typically the OK is eventually heard and video starts again. Finally it doesnt (or you stop it before it restarts).

I never see such repetition of the dialog in a debug trace on my system.

It would seem that video (but not audio) streaming is paused by the client until an OK response is heard. I would guess your issue is an intermittent network problem that prevents the client hearing the response (?)

The action takes place at line 303 in lib/httpd.c which calls conn_request in lib/raop.c (which includes lib/raop_handlers.h, which is realy a part of raop.c, not a typical include file) this prepares the "OK" response.

The "OK" response is sent to the client at line 318 in httpd.c. I'm guessing it goes out on the second tcp port (n+1 for uxplay -p n)

k0ste commented 2 years ago

When timeout received video stops streaming, but audio continues Screenshot-1

Also recored pcac's on n+1 port, with heartbeats /feedback uxplay_pcaps.zip

Also, today Apple rolled MacOS 12.2, I was noticed that in last test with this version reconnect was performed on first timeout occurence

[k0ste@WorkStation ~]$ uxplay -s 3840x2160 -vs "waylandsink fullscreen=true" -fps 60 -p 8888
using system MAC address 9c:b6:d0:95:a7:1d
Initialized server socket(s)
Accepted IPv4 client on socket 29
Local: 192.168.1.11
Remote: 192.168.1.4
Open connections: 1
Client identified as User-Agent: AirPlay/605.1
Accepted IPv4 client on socket 31
Local: 192.168.1.11
Remote: 192.168.1.4
Open connections: 2
raop_rtp_mirror starting mirroring
ct=8 spf=480 usingScreen=1 isMedia=1  audioFormat=0x1000000
start audio connection, format AAC-ELD 44100/2
raop_rtp starting audio
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout

< here is reconnect and video continue working after some freeze>

raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_rtp_mirror error in header recv: 104   <------ after this, no more packets on both sides on port 8888
fduncanh commented 2 years ago

I dont know how to open/read the pcaps file does the "OK" heartbeat response make it back to the client?

Is the issue over now? Any diagnosis of what was happening?

fduncanh commented 2 years ago

Looking at the debug output, one sees that

So both client and server are watching each other to see if they are still there. At least for a while the client returns the timestamp even when video streaming is paused (and audio still continues to stream).

k0ste commented 2 years ago

I dont know how to open/read the pcaps file

This possible with wireshark The heartbeats in this case is a TCP

image
fduncanh commented 2 years ago

So it's still a mystery why the macOS 12.x client (but also an iOS client you tested ?) stopped streaming, but are you saying the error is gone after an update to macOS 12.2?

I was assuming the repeated heartbeat requests during periods with no video streaming were because the response was not being heard by the client. But in fact the debug traces show that the heartbeat requests by the client are still being sent at exactly 2 second intervals, so nothing is wrong with the heartbeats. So this is not in fact an indicator of the problem. The client remains in contact with the server, and knows it is alive.

The other ntp signal will be udp, and the server uses local "timing_lport" which is the first udp port n if -p n is used, and the client uses remote timing_rport, which is found in the first SETUP request from the client

Client identified as User-Agent: AirPlay/600.8.1
16 byte aeskey after sha-256 hash with ecdh_secret:
c5 f2 e8 f2 0d 07 85 1e a9 ed 1a 85 e8 5f 6f 38 

timing_rport = 53376   <----------------------------------------------udp port on client for receiving and  responding to ntp requests
raop_ntp parse remote ip = 192.168.1.4
raop_ntp starting time
raop_ntp local timing port socket 30 port UDP 58344   <------------------udp port on server for sending and receiving ntp signals.
raop_rtp parse remote ip = 192.168.1.4
raop_rtp_mirror parse remote ip = 192.168.1.4 
eport = 39191, tport = 58344  <------------------------eport is server TCP port for RTSP communications (

So it seems purely a client issue why video streaming stops (?)

fduncanh commented 2 years ago

OK I've replicated your problem when mirroring a macbook (unibody) running Catalina, with no activity on the macbook screen, using "use as a separate display" option for the mirror window.

After a period (maybe 20 mins, but the debug output wasn't saved), the server's ntp request for the client's local time was not being replied to within the timeout.

At least I'll be able to try to see why it happens (screen inactivity?) and whether it eventually recovers.

If its unrecoverable, I'll make raop drop the connection after e.g 20 ntp timeouts (20 x 3 secs = 1 min of failure).

I need to see if the ntp timeout is ever recovered from (earlier halts to video streaming in your debug logs were recovered from provided the ntp communication was working with no timeouts)

If one needs to wait 20 mins for failures, its difficult to debug!

(But If all the issues take place on the macOS client, there's nothing that UxPlay can do, except drop the connection...)

k0ste commented 2 years ago

Good to hear that is reproducible. Maybe this is some power saving. Is I was say before, on Linux I was disabling WiFi power saver for exclude influence like increasing latency or packet drops, via iw dev wlan0 set power_save off

fduncanh commented 2 years ago

I'm still trying to reproduce it cleanly. The issue is to see whether any recovery after an ntp timeout is ever possible. I would maybe allow 10 timeouts (30 sec) or 5 timeouts (15 sec) before closing the connection, with a console message. making reconnection (manually) by the client possible.

fduncanh commented 2 years ago

In the end I did not mange to reproduce it a second time.

Can you still reproduce the problem? Do you have any suggestions?

k0ste commented 2 years ago

I think the connection drop is a good idea, I can perform tests, and we will know help this trick or not

fduncanh commented 2 years ago

@k0ste Please get me a debug trace showing the problem, using the latest github master code. I found that UxPlay is dropping video frames with "05" as the fifth entry in the video header.
These are rare in the usual streaming I see. but maybe your client is streaming only these "05" frames when it "freezes"?

In that case I can probably fix it, as the "dsafa22" code that RPiPlay was taken from has the code for handling "05" frames. (These may be "resent" frames) Lets hope this is the issue: It would be another rpiplay bug I have fixed!

k0ste commented 2 years ago

Log with build from last commit uxplay.log.zip

Seems sometimes 05 header is present:

3607 05
√ Downloads % ack 'video packet header' | awk '{print $8}' | sort | uniq -c
129420 00
   2 01
3607 05    <-------------------
fduncanh commented 2 years ago

Yes but it is unfortunately not the cause of your problem. The dropped "05" video packets seem to be a minor issue that I may try to fix, or just leave unfixed.

So what I see in this latest debug is

So it is unexplained why the client suddenly stops sending video packets, and heartbeat requests, and stops replying to the servers timestamp requests (at least within a 0.3 sec time out window). After it restarts the heartbeat /timestamp dialog, the video frames don't restart (except perhaps for one bad TCP message.)

I was wondering if the client was sending '05' video that was being ignored by uxplay, but this is not the case.

fduncanh commented 2 years ago

AHA!!!. That one bad video packet after the dialog seems to shut down the raop_mirror TCP thread !

This is the servers first TCP port for incoming video data.

The heatbeat is received on the servers's second TCP port and the ntp timestamps are received on the first UDP port.

So (????) if the TCP thread for video hadn't been exited from because a bad pcket was detected, maybe the video would probably have restarted after some issue (with either the client or the network) that temporarily blocked signals from the client reaching the server was over!

fduncanh commented 2 years ago

@k0ste

Please try again; maybe its fixed? (I now allow one "bad" video frame before exiting)

fduncanh commented 2 years ago

Its possible I will also need to flush the bad frame but lets see if this works

fduncanh commented 2 years ago

unfortunately the error 104 is recv errno 104 (connection reset by peer) so my fix wont work, and I'll need to trigger reopening the socket to restart the connection, but this should not be difficult now its clear what is happening. . EDIT I'll revert the change and let you know when the revised fix is posted

fduncanh commented 2 years ago

@k0ste

Please test again with latest github. Dont need to use -d.

This is now a detection of ECONNRESET error. No fix so far, just an error message. Please see if the screen freezing issue triggers the error message.

k0ste commented 2 years ago

When screen actually freeze (work's fine for 2 or 3h), the ECONNRESET catched

using system MAC address 9c:b6:d0:95:a7:1d
Initialized server socket(s)
Accepted IPv4 client on socket 29
Local: 192.168.1.11
Remote: 192.168.1.4
Open connections: 1
Client identified as User-Agent: AirPlay/605.1
Accepted IPv4 client on socket 31
Local: 192.168.1.11
Remote: 192.168.1.4
Open connections: 2
raop_rtp_mirror starting mirroring
ct=8 spf=480 usingScreen=1 isMedia=1  audioFormat=0x1000000
start audio connection, format AAC-ELD 44100/2
raop_rtp starting audio
raop_ntp receive timeout
raop_ntp receive timeout
ct=8 spf=480 usingScreen=1 isMedia=1  audioFormat=0x1000000
raop_rtp starting audio
ct=8 spf=480 usingScreen=1 isMedia=1  audioFormat=0x1000000
raop_rtp starting audio
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_ntp receive timeout
raop_rtp_mirror error  in header recv: 104
***ERROR Exited raop_rtp_mirror thread because ECONNRESET was received
fduncanh commented 2 years ago

Yes the timeouts are some kind of network issue. This one lasted about 2 minutes. If they are short enough, the connection can recover, but not if something got blocked in the socket. ECONNRESET is only signaled when the system attempts to recover and something broken arrives at the socket. Still considering what to do as a fix. Using ECONNRESET should be fairly easy.

k0ste commented 2 years ago

This is not "the one timeout", this is bunch of timeouts in different periods of time As I said earlier, without a datetime (%a %b %d %H:%M:%S %Z %Y, for example) it is difficult to determine which time period the timeout belongs to, I think it is will be good to add support for timestamps to the logger configuration

fduncanh commented 2 years ago

@k0ste

I added a timestamp (still in unix time, microseconds after 1970-01-01) and a counter to the timeout. If I shutdown the wifi on the macbook while it is connected I provoke the timeouts. So clearly uxplay should give up after (say) 10 timeouts (10 x 3 = 30sec) and close the connection, with an option to change this number.

(As an aside, I followed up on those mysterious "type 05" video packets, and discovered that they are not video packets but binary plists sent by the client, reporting regularly on video performance such as FPS etc. I added them to debug output, and you can now get them in non-debug terminal output with the new option uxplay -FPSdata ).

fduncanh commented 2 years ago

@k0ste

I think I have fixed your issue, please test latest github.

I have tested the -reset option by stopping and restarting wifi from the mac client to trigger timeouts. I can't test the response to ECONNRESET error. Please test by using uxplay -reset 0 to see what happens when your "after ~ 20 mins" ECONNRESET error occurs (The error message will no longer say ECONNRESET, the message will be something like "TCP connection reset by peer") (uxplay -reset 0 switches of the timeout limit)

fduncanh commented 2 years ago

@k0ste Also, please give me your opinion about what the default value of n in -reset n should be. currently, I chose n = 10, (30 seconds)

If you run uxplay without the -d (debug) option, the terminal will show the timeout history clearly.

In your debug logs, sometimes your system seems to recover from the ntp timeouts without ECONNRESET . (ECONNRESET only happens if new video data is received from the client, but the receiving socket was corrupted when the signal previously got cut off, so if the client is completely gone, this will never happen)

In this case, the counter (should) go back to 0.

so n should not be too small, but not too large either.

fduncanh commented 2 years ago

@k0ste

any updates? Thinking about it, I think most people have a stable network connection, so maybe a reset after 2 consecutive timeouts (at least 3 second of interruption) might be the correct default, and those with timeout issues that self-correct after a few timeouts can use the -reset n option to allow more timeouts before a reset? what do you think?