ATV hangs during DirectPlay

moodyblue commented 9 years ago

When ATV plays a movie in DirectPlay mode, sometimes, and at random places, it enters a perpetual buffering state. Pressing menu in the ATV remote gets the ATV out of the buffering state, but the ATV is no longer able to play any other movie. An extensive description of the situation can be found in this thread -> https://forums.plex.tv/index.php/topic/145463-atv-enters-perpetual-buffering-during-video/

More tests showed that following paragraphs are not true:

I have found that, if an external machine is forcing PlexConnect to be active, even though a buffering condition may still appear, the ATV is no longer in a "hang state", because it can resume playing and/or play any other movie -> https://forums.plex.tv/index.php/topic/145463-atv-enters-perpetual-buffering-during-video/?p=924692

Would it be possible to implement a "keep alive" function to be used by PlexConnect whenever a film is being played in DirectPlay mode ?

Fredde87 commented 9 years ago

Since the Plex forum is down I'll post here. Having the exact same problem as described in the thread posted above. My library consists of mostly 720p iTunes DRM protected content.

I've been analyzing the network usage on my Plex+PlexConnect server and noticed extreme high usage. The test movie I am using is 3.7gb in size. However once I hit play on the movie I see my network utilization between 50-90mbit/s (only got a 100mbit connection to my Apple TV).

How is this even possible? I can see Plex sending up to 5GB to the ATV before the buffer issue starts, usually about 5-15 min into the movie. How can it transfer so much data when the movie is not even that big in size? It can't be transcoding because its DRM protected, so it should serve the file as it is right?

So I am thinking the movie is maybe not being sliced correctly into chunks? Which eventually fills up the buffer/cache/memory/space on the ATV?

Anyone else seeing the same thing?

iBaa commented 9 years ago

Anyone else seeing the same thing?

Yes... that's what the forum's thread is all about.

Fredde87 commented 9 years ago

I meant more specifically are you seeing extreme amounts of data being passed between your ATV and Plex for a small 3.6gb movie?

Using nload to view the network usage, but starting to think maybe its not that accurate. Grabbed a tcpdump yesterday and had a look and I can't see any irrelevant data. The last request from the ATV was sent to it, but it never requests the next part when the problem occurs. Had a few RST's being sent from the ATV a second before, not sure if it is relevant or why the ATV is resetting the connection

moodyblue commented 9 years ago

It's normal to have a peak when the film starts playing. Maybe your tools are "confusing" bits with bytes (bytes have 8 bits). I was also thinking to trace the network flow but I have no skills to analyze the data. I am also convinced that the problem is related with the protocol implementation on the PMS side. Maybe Baa could give us a contact in the Plex team to help debug the situation.

moodyblue commented 9 years ago

Proposed solution -> change application.js and utils.js so that every OPEN statement has false in the 3rd parameter.

roidy commented 9 years ago

Changes committed to git and hopefully fixed now :)

Fredde87 commented 9 years ago

Unfortunately I am still seeing these issues. The latest git code has not solved the issue for me. I left 8 Apple TV's (gen 3) running last night playing the Paddington movie (720p iTunes purchased DRM) and two of those eight ATV's stopped and entered this buffering state where the only solution was to restart them.

I've looked at the application.js file and there is still two req.open's that have true as it's third parameter. Is this a mistake?

moodyblue commented 9 years ago

Are you able to perform the same test with 8 ATV's using iTunes as the media server ? I also have a file that creates the buffering problem in the ATV when PMS is the media server, even after this patch to PlexConnect. But this file also triggers the buffering situation when iTunes is the media server. This is an indication that there is also a problem within the ATV itself. I would report the problem to Apple if I knew how to do it...

Regarding the other req.open's, you can modify them as well, it would not hurt, but probably you would not solve the problem because the only req.open that is executed during video playback is the logging one. The others are executed when you navigate accross PlexConnect panels.

2015-11-07 20:11 GMT+00:00 Fredde87 notifications@github.com:

Unfortunately I am still seeing these issues. The latest git code has not solved the issue for me. I left 8 Apple TV's (gen 3) running last night playing the Paddington movie (720p iTunes purchased DRM) and two of those eight ATV's stopped and entered this buffering state where the only solution was to restart them.

I've looked at the application.js file and there is still two req.open's that have true as it's third parameter. Is this a mistake?

— Reply to this email directly or view it on GitHub https://github.com/iBaa/PlexConnect/issues/344#issuecomment-154744213.

iBaa commented 9 years ago

Well, if you look at network communication using XMLHttpRequest as the culprit, the most stressed one while playing media is called from playerTimeDidChange - every 5sec to notify PMS of the current watch progress, every 1minute to poke the transcoder (in case you are actually transcoding). Logging itself is done rather rarely.

Since it looks like the theory is that aTV/iOS crashes based on internal overflow or an shortage on http channels, one could try to modify the report times. Like decrease the period to make it worse, stretch the time to cover the issue. Please, could somebody seeing this issue set the time difference in application.js/line 60 to 500ms or even 50ms? That should trigger the fault pretty quickly.

I still fail to understand the effect of that falseto true change, JS is not doing anything with the incoming results anyway. The only thing I could think of is that iOS somehow has an issue with garbage collection and/or cleaning up those dangling XMLHttpRequests...

Fredde87 commented 9 years ago

There is a 5 ATV limit to Home Sharing but I can give it a go!

On Sat, Nov 7, 2015 at 7:55 PM, moodyblue notifications@github.com wrote:

Are you able to perform the same test with 8 ATV's using iTunes as the media server ? I also have a file that creates the buffering problem in the ATV when PMS is the media server, even after this patch to PlexConnect. But this file also triggers the buffering situation when iTunes is the media server. This is an indication that there is also a problem within the ATV itself. I would report the problem to Apple if I knew how to do it...

Regarding the other req.open's, you can modify them as well, it would not hurt, but probably you would not solve the problem because the only req.open that is executed during video playback is the logging one. The others are executed when you navigate accross PlexConnect panels.

2015-11-07 20:11 GMT+00:00 Fredde87 notifications@github.com:

Unfortunately I am still seeing these issues. The latest git code has not solved the issue for me. I left 8 Apple TV's (gen 3) running last night playing the Paddington movie (720p iTunes purchased DRM) and two of those eight ATV's stopped and entered this buffering state where the only solution was to restart them.

I've looked at the application.js file and there is still two req.open's that have true as it's third parameter. Is this a mistake?

— Reply to this email directly or view it on GitHub https://github.com/iBaa/PlexConnect/issues/344#issuecomment-154744213.

— Reply to this email directly or view it on GitHub https://github.com/iBaa/PlexConnect/issues/344#issuecomment-154761243.

Fredde87 commented 9 years ago

I will test this now, I've changed it to 50ms

Edit: I've tested this now and this made no difference, it did not increase the likelyhood of it running (I played it on 4 Apple TVs and none of them crashed).

By the way, why do we think this is due to http channel shortage? Isn't that the one thing we know that works? Once the problem occurs, there is no issue for the ATV to download XML files, I can still open Apps and get the menu's etc. It is media (thumbnails, video streams) etc that does not work anymore. What I am trying to say, if it was a shortage of http channels then we wouldnt be able to download the XML files needed for the menu structures?

On Sun, Nov 8, 2015 at 5:38 AM, iBaa notifications@github.com wrote:

Well, if you look at network communication using XMLHttpRequest as the culprit, the most stressed one while playing media is called from playerTimeDidChange - every 5sec to notify PMS of the current watch progress, every 1minute to poke the transcoder (in case you are actually transcoding). Logging itself is done rather rarely.

Since it looks like the theory is that aTV/iOS crashes based on internal overflow or an shortage on http channels, one could try to modify the report times. Like decrease the period to make it worse, stretch the time to cover the issue. Please, could somebody seeing this issue set the time difference in application.js/line 60 to 500ms or even 50ms? That should trigger the fault pretty quickly.

I still fail to understand the effect of that falseto true change, JS is not doing anything with the incoming results anyway. The only thing I could think of is that iOS somehow has an issue with garbage collection and/or cleaning up those dangling XMLHttpRequests...

— Reply to this email directly or view it on GitHub https://github.com/iBaa/PlexConnect/issues/344#issuecomment-154789518.

moodyblue commented 9 years ago

IMHO the problem is caused by not serializing the requests between ATV and PMS. I can make another test with PMS verbose and debug log to confirm my theory, but I have observed that, before the patch was implemented, PMS had several sessions in parallel to send "video slices" to the ATV. The LAN utilization was high in the beggining, and was steady afterwards, feeding the ATV constantly with data, probably faster than the ATV could accept in its internal buffer. Also, there were no "periods of silence" in the LAN.

After the patch, the LAN utilization became very erratic, with some "periods of silence". It looks like the ATV got control of the communications and asks "video slices" only when needed, thus avoiding internal buffer overrun, and consequently the "perpetual buffering" situation.

Having said so, I do not have test files to capable of triggering the problem after the patch. Well, in fact I have one, but this file also triggers the problem in iTunes, so I think that it cannot be used. Therefore I am not 100% sure that the patch solves the problem. It seems to solve in a very large number of situations.

@Fredde87, you are free to test with req.open set to "false" and report if this makes a difference in your case. If changing other req.open statements solves your problem I'm am sure that Baa / Roidy will be happy to incorporate that as a formal patch.

Fredde87 commented 9 years ago

@moodyblue, I've now played the same movie via Home Sharing 11 times and it didn't crash once. So the problem seems to be unique to PlexConnect. Think we are back to where we were before the patch :(

On Mon, Nov 9, 2015 at 5:27 AM, moodyblue notifications@github.com wrote:

IMHO the problem is caused by not serializing the requests between ATV and PMS. I can make another test with PMS verbose and debug log to confirm my theory, but I have observed that, before the patch was implemented, PMS had several sessions in parallel to send streams to the ATV. The LAN utilization was high in the beggining, and was steady afterwards, feeding the ATV constantly with data, probably faster than the ATV could accept it in its internal buffer. Also, there were no "periods of silence" in the LAN.

After the patch, the LAN utilization became very erratic, with some "periods of silence". It looks like the ATV got control of the communications and asks "video slices" only when needed, thus avoiding internal buffer overrun, and consequently the "perpetual buffering" situation.

Having said so, I do not have test files to capable of triggering the problem in after the patch. Well, in fact I have one, but this file also triggers the problem in iTunes, so I think I cannot use it for a comparison. Therefore I am not 100% sure that the patch solves the problem. It seems to solve in a very large number of situations.

@Fredde87 https://github.com/Fredde87, you are free to test with req.open set to "false" and report if this makes a difference in your case. If changing other req.open statements solves your problem I'm am sure that Baa / Roidy will be happy to incorporate that as a formal patch.

— Reply to this email directly or view it on GitHub https://github.com/iBaa/PlexConnect/issues/344#issuecomment-154995503.

moodyblue commented 9 years ago

@Fredde87, the difficulty here is that neither baa nor roidy were able to recreate the problem. baa may have seen it once, but apparently it did not occur again. And most people have never seen it as well. This situation is extremely hard to diagnose.

During this last year we've been "in circles". Reducing the MTU, pinging PlexConnect, deactivating dolby digital, etc., these actions seemed to alleviate the problem but have never solved it. The same with req.open, the main difference is that, apparently, now only you are able to recreate the problem. I really don't know how I can help you.

Fredde87 commented 9 years ago

Could maybe your or Baa purchase the Paddington movie in 720p in iTunes and put it in your Plex library? Happy to cover the cost and send you money for it over PayPal. See if you get the same problem to appear?

On 10 Nov 2015, at 21:13, moodyblue notifications@github.com wrote:

@Fredde87, the difficulty here is that neither baa nor roidy were able to recreate the problem. baa may have seen it once, but apparently it did not occur again. And most people have never seen it as well. This situation is extremely hard to diagnose.

During this last year we've been "in circles". Reducing the MTU, pinging PlexConnect, deactivating dolby digital, etc., these actions seemed to alleviate the problem but have never solved it. The same with req.open, the main difference is that, apparently, now only you are able to recreate the problem. I really don't know how I can help you.

— Reply to this email directly or view it on GitHub.

moodyblue commented 9 years ago

Thank you for your offer, but I have no technical skills to debug this situation. It took me almost one year, with trial and error, to discover that coding false in the req.open statement had a positive impact. I was lucky to discover it, but I cannot afford to continue testing.

Baa and Roidy are the developers of this product.

iBaa / PlexConnect

ATV hangs during DirectPlay #344