puemos / hls-downloader

Web Extension for sniffing and downloading HTTP Live streams (HLS)
https://puemos.gitbook.io/hls-downloader/
MIT License
1.95k stars 241 forks

HARD: Chunks need (binary) checking and editing before assembling (saving) #65

Open zb-z opened 4 years ago

zb-z commented 4 years ago

Describe the bug Saved video "loses synchronization" in VLC player.

This one may take a long time and some external help unless you already know .mp4 format inside out and have tools to check and diff.

To Reproduce

Note: You may need to add this to uBlock Origin's options (to let only the content pass the gate):

! ALLOW: mcloud.to on putlockertv.to
@@||mcloud.to^$subdocument,domain=putlockertv.to

Goto: https://www7.putlockertv.to/watch/westworld.801pn/xr0lrvz -- Westworld 1

take MyCloud, episode 6 take 1280x720 save as: S02E06.mp4

Goto: http://www.tvsubtitles.net/tvshow-2081-1.html -- Westworld 1 take Season 2, 2x06 [ Phase Space ] take Westworld 2x06 (WEB.DEFLATE) rename to: S02E06.srt -- it's in a .zip

open S02E06.mp4 in VLC (D-click or drag or open explicitly)

-- It will "spin the yellow bar" - VLC does that when it's stuck trying to "open" or "parse" something

open S02E06.mp4 in Win 10's player (R-click, open with "Video & TV") -- It will play. Click the little callout icon (next to the speaker) to check that the .srt is there. Select it so that it's used. Drag the slider to approx 4:00 to verify that it is used [this means synchronized beyond a shadow of a doubt]

rename S02E06.srt to foobar.srt open S02E06.mp4 in VLC again - it plays this time

Additional info

What happens here is that the presence/loading of an .srt makes VLC attempt a synchronization and it gets stuck => the .mp4 is slightly damaged (I suspect that it has "endings" in each chunk - sometimes I can hear them as a very brief audio "disappearance") => the problem is (almost) always there. Understanding the player code (for the web page) might help: I never hear these pauses in the web page player => the player knows what to skip and/or fill from the buffer - that "pause" destroys the buffer.

Either the .ts chunks (from hls/720/720.m3u8) are not just binary slices or a few bytes get added when serializing ("saving" to file).
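One cheap way to test the "not just binary slices" hypothesis is to check each downloaded chunk against the MPEG-TS packet layout: packets are 188 bytes long and each one starts with the sync byte 0x47. The sketch below is only an illustration, not the extension's code; `checkTsChunk` and `TsCheckResult` are hypothetical names, and it assumes the chunks are plain, unencrypted MPEG-TS (not fMP4 or AES-encrypted segments).

```typescript
// MPEG-TS constants: fixed packet size and the sync byte that starts each packet.
const TS_PACKET_SIZE = 188;
const TS_SYNC_BYTE = 0x47;

// Hypothetical result shape, introduced only for this sketch.
interface TsCheckResult {
  packetCount: number;      // number of whole 188-byte packets in the chunk
  trailingBytes: number;    // leftover bytes after the last whole packet
  badSyncOffsets: number[]; // packet start offsets that do not hold 0x47
}

// Checks whether a chunk looks like a clean run of TS packets.
function checkTsChunk(chunk: Uint8Array): TsCheckResult {
  const packetCount = Math.floor(chunk.length / TS_PACKET_SIZE);
  const badSyncOffsets: number[] = [];
  for (let i = 0; i < packetCount; i++) {
    const offset = i * TS_PACKET_SIZE;
    if (chunk[offset] !== TS_SYNC_BYTE) {
      badSyncOffsets.push(offset);
    }
  }
  return {
    packetCount,
    trailingBytes: chunk.length % TS_PACKET_SIZE,
    badSyncOffsets,
  };
}
```

A chunk with `trailingBytes === 0` and an empty `badSyncOffsets` is at least aligned on packet boundaries; a non-zero `trailingBytes` would point to extra bytes at the end of the chunk or added while saving.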

Why does the Win10 player survive this? Because MS takes bugs from reality and VLC plays more Catholic than the Pope :-) It's a frequent occurrence that "open source/free" projects reject or deflect bugs that companies take and fix. The mere appearance of a broken player costs MS a lot, plus this is an interesting problem to dig into and fix by making a player more tolerant.

It can also be that the Win10 player synchronizes subtitles by measuring time, while VLC latches on to the "timing track" and trusts it blindly to be intact. For just playing, a "break" of a few bytes means just a destroyed (reconstruction) buffer (probably just the audio buffer) and a barely noticeable glitch. For synchronizing "by codes" it means "broken stream, don't know what to do".

Other things

One question that may require some digging is whether Chrome provides facilities/API to do binary data reading and editing, (without having to convert everything to base64). You are dealing with about 1000 blocks and 0.5-1.5 GB of data "per hour" so the time to process becomes important.
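For what it's worth, browsers do expose raw bytes without any base64 detour: `Response.arrayBuffer()` returns the body as an `ArrayBuffer`, and `Uint8Array` gives byte-level read/write access on top of it. Below is a minimal sketch of joining already-fetched chunks into one buffer; `concatChunks` is a hypothetical helper, and whether the extension assembles files this way is not something this thread confirms.

```typescript
// Joins binary chunks into a single Uint8Array without any text/base64 conversion.
// `concatChunks` is a hypothetical helper name used only for this sketch.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // First pass: compute the total size so the output is allocated once.
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  // Second pass: copy each chunk to its offset in the output.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Individual bytes can then be inspected or edited by index (for example `out[0] === 0x47`), and a trailing "ending" could be dropped with `chunk.subarray(0, chunk.length - n)` before joining, once the exact `n` is known.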

Try to file a bug against VLC - here . I can't for the time being (pass recovery demands both username and e-mail, no way I can have any notes on a username from zillion lightyears ago :-) so till I find a human contact to unblock me I'm floating in limbo).

I checked against old VLC versions till 2.2.8 - 32 and 64 bit.

Try to ask these guys for some help with the stream/file format (unless you already know it) to nail the exact bytes (or pattern) that need to be checked and removed (unless it's a fixed "ending" that can just be chopped off blindly). Expect huge resistance to a bug :-) You may want to start with "VLC breaks on videos that Windows Player plays perfectly" -- "your enemy has it" tactics :-)))

Don't expect much from free tools, "MP4 Inspector" screams "invalid file format" like a little girl on files that even VLC plays :-). I'm running qcli (QCTools) while writing this - hasn't died yet but is taking forever. Assuming that it works, that's still just the beginning - you'd have to nail the exact frame that ends one .ts and starts the next in order to have anything interesting to see.

Note Tried hlsloader against the same page (it has a lot of problems discovering the .ts list on these pages but its "Capture" mode helps) with S01E05 and it came over OK (== VLC with .srt didn't turn yellow) => snoop in, check out what it's doing :-) That also means that you can grab the same file (say that same S01E05) via both and do a binary diff - that might be the fastest way to zero in on the problem.
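The binary diff can start very simply: find the first byte offset where the two saved files disagree, then look at what sits around that offset. A minimal sketch, assuming both files have already been read into `Uint8Array`s; `firstDifference` is a hypothetical helper name.

```typescript
// Returns the first byte offset at which the two buffers differ,
// or -1 if they are byte-for-byte identical.
// If one buffer is a strict prefix of the other, the length of the
// shorter one is returned (that is where the extra bytes begin).
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const common = Math.min(a.length, b.length);
  for (let i = 0; i < common; i++) {
    if (a[i] !== b[i]) {
      return i;
    }
  }
  return a.length === b.length ? -1 : common;
}
```

If the two downloaders really differ only by a few extra bytes per chunk, the first reported offset should land near the end of the first chunk; note that after an insertion everything downstream is shifted, so later offsets would need a realignment step this sketch does not do.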

jwshields commented 4 years ago

I believe I may be running into similar things with MPC-HC here. Videos downloaded by the extension, from Twitter, or other websites, tend to freeze the video stream for about 1-2 seconds while the audio plays if I seek in the player. Occasionally audio desync does happen, but mostly the frozen video is what I experience.

I've been able to work around it externally using ffmpeg though. Running videos through it seems to create a functional file by rebuilding the mp4 container:

ffmpeg.exe -err_detect ignore_err -i .\video_filename.mp4 -c copy .\video_filename.fixed.mp4

zb-z commented 4 years ago

Any chance you can check one of these rebuilt mp4-s in VLC with a .srt "attached"? Doesn't have to be the real .srt for a particular video - just any .srt with timestamps beyond the length of the video removed (no need to be poking into bugs further than we need to :-). Say if the video is 3m long, the last .srt timestamp shouldn't go beyond say: 00:02:55,084 --> 00:02:57,040 . And then the same in Win-10's player.
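Cutting an arbitrary .srt down to the video length can be scripted rather than done by hand. A rough sketch, assuming a well-formed SubRip file (cue blocks separated by blank lines, timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`); `trimSrt` and `toMs` are hypothetical helper names.

```typescript
// Matches a SubRip timing line and captures start (1-4) and end (5-8) fields.
const CUE_TIME =
  /(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})/;

// Converts hour/minute/second/millisecond fields to milliseconds.
function toMs(h: string, m: string, s: string, ms: string): number {
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000 + Number(ms);
}

// Keeps only the cues that end at or before maxEndMs.
// Blocks without a recognizable timing line are dropped as well.
function trimSrt(srt: string, maxEndMs: number): string {
  const blocks = srt
    .replace(/\r\n/g, "\n")
    .split(/\n{2,}/)
    .map((block) => block.trim());
  const kept = blocks.filter((block) => {
    const match = CUE_TIME.exec(block);
    if (match === null) {
      return false;
    }
    return toMs(match[5], match[6], match[7], match[8]) <= maxEndMs;
  });
  return kept.join("\n\n") + "\n";
}
```

For the 3-minute example, `trimSrt(text, 180000)` would keep the cue ending at 00:02:57,040 and drop anything that ends later.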

The reason I ask is that while I do blame the downloader (since I'm able to get the same thing from another extension without, let's say, "gaps") I'm also trying to make a case against VLC (filing a bug "there" is an infinitely frustrating "experience") since Win-10's player is direct proof that these gaps are a minor glitch that's very easy to sail over - with a bit of good will of course :-).

BTW, isn't VLC supposed to be using that "same" ffmpeg (as a linked library)? I semi-remember reading something about that somewhere.

jwshields commented 4 years ago

Hello, I apologize for the lag in responding.
I was unable to download the video that you've linked.
I tested with an SRT of my own, along with a video I have saved.
I tested with a number of media players, and am putting my results below.

I hope this answers your questions, or at least helps-

Broken MP4 + SRT:

"Fixed" MP4 + SRT:

Broken MP4:

"Fixed" MP4:

zb-z commented 4 years ago

Add this to your uBlock filters:

! ALLOW: mcloud.to on putlockertv.to
@@||mcloud.to^$subdocument,domain=putlockertv.to

I completely forgot to mention that part - I copy&paste uBlock configs the sec I install it everywhere (and tow an old portable Chrome around) so that possible issue hasn't been on my mind for something like 2-3 yrs. If that doesn't help then go to Starbucks - with laptop of course :-)

What you wrote sounds like your file(s) have a more severe damage/problem than what I came across. I suspect that it could be as small as a "code" at the end of each chunk that "closes" the chunk and should be skipped/stripped when writing bytes to a file. There could also be N junk bytes after the closing code and Twitter might be more generous with junk :-)). Would need a very good (and compact) definition of the .mp4 format and a util to be able to align "delimiter/marker" bytes fast (without falling asleep 100 times :-)

I semi-suspect that ffmpeg might be going overboard and assuming "real error" without thinking that it might be just an extra marker/code that's "usually not expected". There are a few assumptions that would have to be checked. Bigger problem is how to file a bug against VLC - and not be summarily dismissed (with "we are not obliged to play damaged files"). The whole problem is entirely HLS related, thereby quite recent.

Which is not to say that hls-downloader shouldn't try to fix itself, if possible :-), but if WMP plays it there's no excuse for a "much better player" not to play it. I have pretty good idea why VLC trips and WMP doesn't.

Stream Recorder extension (iogidnfllpdhagebkblkgbfijkbkjdmm) doesn't have the problem but it could be that it uses entirely different tactics which might be between hard and impossible to replicate in a different extension. Don't know if Chrome has API for byte peek&poke-ing (taking a roundabout path via base64 would be prohibitively expensive.)