user234683 / youtube-local

browser-based client for watching Youtube anonymously and with greater page performance
GNU Affero General Public License v3.0

add videojs and qualitySelector #70

Closed zrose584 closed 3 years ago

zrose584 commented 3 years ago

Some issues I noticed:

Before being able to use videojs, one currently has to run ./server.py init-js to download the min.js files

user234683 commented 3 years ago

Is the cloudtube approach you adapted feasible? It seems like using JavaScript to keep an <audio> and a <video> element in sync would be very fragile. I imagine it would work in 90% of cases, when network conditions are normal, but during congestion or for people with slow/unusual connections, it seems like it wouldn't work. But if it does work, that would be great, because it's much less code than the alternatives.

omarroth (Invidious dev) discussed it here and said he used a DASH library as a way to get the necessary buffering logic to mux the video and audio.

It turns out handling all of that buffering is just inherently complex, so an implementation with Media Source Extensions would be very large. I found that the Firefox implementation is at least 5580 lines of C++: MediaDecoder.cpp and MediaDecoderStateMachine.cpp, which probably doesn't include additional helper libraries for downloading files and dealing with network troubles. So it's more understandable that hls.js is 21k lines of code, given that it also has to deal with additional livestream complexity.

zrose584 commented 3 years ago

Is the cloudtube approach you adapted feasible?

In theory one can detect that a track (audio or video) has paused due to lack of data by listening for the waiting event. So if e.g. the video track runs out of data, one can pause the audio until the playing event fires, at which point one resumes audio playback. Whether this also works in practice remains to be seen; the (few) tests I did looked promising.
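A minimal sketch of that idea, assuming separate <audio> and <video> elements already on the page (real code would also need error handling and more careful drift correction):

```js
const video = document.querySelector('video');
const audio = document.querySelector('audio');

// Mirror user-initiated play/pause/seek from the video onto the audio track.
video.addEventListener('play', () => audio.play());
video.addEventListener('pause', () => audio.pause());
video.addEventListener('seeking', () => { audio.currentTime = video.currentTime; });

// If the video stalls for lack of data, hold the audio until it recovers.
video.addEventListener('waiting', () => audio.pause());
video.addEventListener('playing', () => {
  audio.currentTime = video.currentTime; // correct any drift before resuming
  if (!video.paused) audio.play();
});
```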

However, once we want to support livestreams, we have to use DASH or HLS anyway, right?

I found that the Firefox implementation is at least 5580 lines of C++: MediaDecoder.cpp and MediaDecoderStateMachine.cpp, which probably doesn't include additional helper libraries for downloading files and dealing with network troubles. So it's more understandable that hls.js is 21k lines of code, given that it also has to deal with additional livestream complexity.

Hmm... But isn't the point of Media Source Extensions to provide a high-level API for libraries like hls.js? I also wonder what hls.js is doing that leads to 21k loc. It should "just" download the media chunks and feed them into MediaSource, right? Why does it have to parse mp4/aac? Can't MediaSource do that already?
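For reference, the naive "just feed it into MediaSource" flow looks roughly like this (segment URLs and the codec string are placeholders). One catch I know of: SourceBuffers generally only accept fragmented MP4 or WebM, so e.g. HLS's MPEG-TS segments have to be remuxed first, which is part of what hls.js does:

```js
const segmentUrls = [/* ...URLs of fragmented-MP4 segments... */];
const video = document.querySelector('video');
const ms = new MediaSource();
video.src = URL.createObjectURL(ms);

ms.addEventListener('sourceopen', async () => {
  const sb = ms.addSourceBuffer('video/mp4; codecs="avc1.4d401f, mp4a.40.2"');
  for (const url of segmentUrls) {
    const data = await (await fetch(url)).arrayBuffer();
    sb.appendBuffer(data);
    // appendBuffer is asynchronous; wait before appending the next segment
    await new Promise(resolve =>
      sb.addEventListener('updateend', resolve, { once: true }));
  }
  ms.endOfStream();
});
```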

In their README they also list many additional features, like Timed Metadata, AES-128 decryption, captions, Adaptive streaming... Maybe this is where the 21k loc come from, and not the "livestream complexity"?

user234683 commented 3 years ago

However, once we want to support livestreams, we have to use DASH or HLS anyway, right?

Right, but we would only have to load hls.js for livestreams. If we could mux without hls.js, then e.g. someone with a slow browser and internet connection could watch most videos at 240p without slowing the browser down.

Hmm... But isn't the point of Media Source Extensions to provide a high-level API for libraries like hls.js? I also wonder what hls.js is doing that leads to 21k loc. It should "just" download the media chunks and feed them into MediaSource, right? Why does it have to parse mp4/aac? Can't MediaSource do that already?

That's a good question. I'm still not sure what's going on inside those files, but I did compare some different HLS/DASH implementations:

Dash

HLS

As seen from hasplayer.js, the HLS-specific parts are only 2 kloc. Similarly, in Google's Shaka Player, the dash directory is 4 kloc and the hls directory is 3.6 kloc.

These libraries all seem to reimplement stuff like VTT parsing. Maybe the reason they're reimplementing browser functionality is that there's some browser out there (Internet Explorer?) that doesn't implement some APIs.

I also found this small dash MSE implementation, which might give some insight into how an MSE implementation can work, but I couldn't get it to run. And I think this might be one too. I have some other bookmarks/projects I found while searching for muxing solutions that might be useful, but I'm out of time, so I'll post them later.

zrose584 commented 3 years ago

Right, but we would only have to load hls.js for livestreams. If we could mux without hls.js, then e.g. someone with a slow browser and internet connection could watch most videos at 240p without slowing the browser down.

I think quality selection for livestreams will only work through hls.js, so we already need the quality-selector logic when using it. If we also used HLS for normal videos, no additional logic would be required. So from a "less code = good" perspective, I think this would be preferable.
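For what it's worth, quality selection through hls.js could look roughly like this (the <select id="quality"> element and the stream URL are made up; the hls.js docs are the authoritative reference for the API):

```js
const video = document.querySelector('video');
const hls = new Hls();
hls.loadSource('/videoplayback/stream.m3u8'); // placeholder URL
hls.attachMedia(video);

hls.on(Hls.Events.MANIFEST_PARSED, () => {
  const select = document.querySelector('#quality');
  // hls.levels lists the available variants (resolution/bitrate).
  hls.levels.forEach((level, i) => select.add(new Option(level.height + 'p', i)));
  select.add(new Option('auto', -1));
  // An index forces that variant; -1 re-enables automatic adaptive selection.
  select.onchange = () => { hls.currentLevel = Number(select.value); };
});
```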

I see the cloudtube approach (adapted in this PR) more as a "stepping stone", which can later be upgraded to an HLS-based solution.

Do you only want to use hls.js (or the like) for livestreams? If so, why?

Maybe the reason they're reimplementing browser functionality is that there's some browser out there (Internet Explorer?) that doesn't implement some APIs.

Yet they all rely on Media Source Extensions, don't they? I think WebVTT has more support than MediaSource. Another guess would be that Media Source Extensions is simply not flexible enough to allow "higher" features like encryption or adaptive streaming.

I also found this small dash MSE implementation [..] but I couldn't get it to run

My guess is that some packages had breaking changes (in the last 6 years...). One could try removing the '^' prefix from the version strings in the package.json. But I am not sure if this is enough, as sub-dependencies are (afaik) not locked. That's what a package-lock.json file is for, which sadly that project doesn't use.
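To illustrate with a made-up entry: "babel-core": "^5.1.0" lets npm install any 5.x release, while dropping the caret pins the exact version:

```json
{
  "dependencies": {
    "babel-core": "5.1.0"
  }
}
```

This still doesn't pin what babel-core itself depends on, though; only a lockfile does that.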

user234683 commented 3 years ago

I figured out the issue with that small dash implementation I linked (solution posted in the original issue). The only other change I had to make was changing <script src="gulp-babel/node_modules/babel-core/browser.js"></script> to <script src="babel-core/browser.js"></script> in demo/index.html.

Some of the external mpd files don't work because of CORS or other problems (I think omarroth mentioned CORS problems for dash streams), but the "Akamai - Envivio Demo" under "Static, Fixed Length" works. You can seek to different spots in the video, audio syncs properly, buffering works, etc. So I think it will be possible to use this library. It might even be possible to ignore the dash parts of the library and rip out the buffering logic directly to make a specialized audio+video MediaSource solution, so no server-side pseudo-dash stream would be necessary. Maybe we can make a gist of it afterwards, too.
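Extending the naive append loop sketched earlier to two SourceBuffers, such a specialized audio+video MediaSource setup could look roughly like this (segment URL lists and codec strings are placeholders, and the real buffering/retry logic is exactly the hard part discussed above):

```js
const videoSegmentUrls = [/* ...fMP4 video segment URLs... */];
const audioSegmentUrls = [/* ...fMP4 audio segment URLs... */];
const video = document.querySelector('video');
const ms = new MediaSource();
video.src = URL.createObjectURL(ms);

ms.addEventListener('sourceopen', () => {
  // One SourceBuffer per track, fed from separate segment lists.
  feed(ms.addSourceBuffer('video/mp4; codecs="avc1.4d401f"'), videoSegmentUrls);
  feed(ms.addSourceBuffer('audio/mp4; codecs="mp4a.40.2"'), audioSegmentUrls);
});

// Sequentially fetch and append segments into one SourceBuffer.
async function feed(buf, urls) {
  for (const url of urls) {
    const data = await (await fetch(url)).arrayBuffer();
    buf.appendBuffer(data);
    await new Promise(resolve =>
      buf.addEventListener('updateend', resolve, { once: true }));
  }
}
```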