instance01 / Twitch-HLS-AdBlock

Block advertisements that are inserted in Twitch streams directly.
MIT License

webRequest.filterResponseData instead of monkey patching #7

Open stoically opened 5 years ago

stoically commented 5 years ago

Thanks for your work on THA!

I was wondering whether it would be possible to use webRequest.filterResponseData on the m3u8 playlist URLs and filter the SCTE-35 flags out of the response that way, instead of monkey patching. (This would of course only be an option for Firefox, since Chrome can't modify response bodies yet.) Sorry if that's an ignorant question and I'm missing what else THA does besides filtering the flags.

stoically commented 5 years ago

FWIW, I've tried the following, but the ad starts playing, then stops after 1-2 seconds and only shows the buffering circle.

browser.webRequest.onBeforeRequest.addListener(details => {
  const filter = browser.webRequest.filterResponseData(details.requestId);
  const decoder = new TextDecoder('utf-8');
  const encoder = new TextEncoder();

  filter.ondata = event => {
    let str = decoder.decode(event.data, {stream: true});
    if (str.includes('#EXT-X-SCTE35-OUT')) {
      str = str.replace(/#EXT-X-SCTE35-OUT(.|\s)*#EXT-X-SCTE35-IN/gmi, '');
      str = str.replace(/#EXT-X-SCTE35-OUT(.|\s)*/gmi, '');
      str = str.replace(/#EXT-X-DISCONTINUITY/gi, '');
      str = str.replace(/#EXT-X-DATERANGE:ID="stitched-ad.*/gi, '');
    }
    filter.write(encoder.encode(str));
    filter.disconnect();
  };
}, {urls: ['https://*.hls.ttvnw.net/*.m3u8'], types: ['xmlhttprequest']}, ['blocking']);
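
One possible contributor to the stall, offered as a guess rather than a diagnosis: `filter.disconnect()` runs inside `ondata`, so only the first chunk is rewritten and the rest of the response passes through untouched, and a tag that straddles two chunks would be missed. A minimal sketch that buffers the whole playlist and rewrites it once in `onstop` follows. `stripStitchedAds` and `attachPlaylistFilter` are hypothetical names, the tag patterns are carried over from the snippet above and are not verified against current Twitch playlists, and `\n` line endings are assumed. It only changes how chunks are handled and does not touch `#EXT-X-MEDIA-SEQUENCE`.

```javascript
// Hypothetical helper: removes ad-related lines from a complete playlist string.
// The tag names are assumptions taken from the snippet above.
function stripStitchedAds(playlist) {
  return playlist
    // Finished ad break: drop everything from the OUT line through the IN line.
    .replace(/^#EXT-X-SCTE35-OUT[\s\S]*?^#EXT-X-SCTE35-IN.*\n?/gim, '')
    // Ad break still running: drop from the OUT line to the end of the playlist.
    .replace(/^#EXT-X-SCTE35-OUT[\s\S]*/gim, '')
    // Exact DISCONTINUITY lines only, so DISCONTINUITY-SEQUENCE is left alone.
    .replace(/^#EXT-X-DISCONTINUITY$\n?/gm, '')
    .replace(/^#EXT-X-DATERANGE:ID="stitched-ad.*\n?/gim, '');
}

// Hypothetical listener: collects every chunk, then writes the filtered body once.
function attachPlaylistFilter(details) {
  const filter = browser.webRequest.filterResponseData(details.requestId);
  const decoder = new TextDecoder('utf-8');
  const encoder = new TextEncoder();
  let body = '';

  filter.ondata = event => {
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    body += decoder.decode(event.data, {stream: true});
  };

  filter.onstop = () => {
    body += decoder.decode(); // flush any bytes still held by the decoder
    filter.write(encoder.encode(stripStitchedAds(body)));
    filter.close();
  };
}

// Only register inside a WebExtension background script (Firefox).
if (typeof browser !== 'undefined' && browser.webRequest) {
  browser.webRequest.onBeforeRequest.addListener(
    attachPlaylistFilter,
    {urls: ['https://*.hls.ttvnw.net/*.m3u8'], types: ['xmlhttprequest']},
    ['blocking']
  );
}
```

Keeping the string rewriting in its own function means it can be checked against saved playlists without loading the extension.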

instance01 commented 5 years ago

Very exciting stuff, I like this idea a lot. At least for Firefox it would make the code much cleaner. And if Chrome ever adds this functionality, it would simplify things a lot, so I think this is a good general direction for the extension to evolve in.

Regarding your code snippet, I think fixing the media sequence is still mandatory. If I recall correctly this is because we're watching the actual stream and increasing the media sequence of that stream, while Twitch thinks we're watching the ad stream. After the 30 secs of ads we get to the real stream and Twitch expects the media sequence to be where we left off 30 secs ago. If we don't reset it, the stream will start glitching badly because we're suddenly in the future according to Twitch.

I'll look at this a bit more closely when I get the time (or actually get some preroll ads again), or of course welcome any PR.

stoically commented 5 years ago

Glad you like it.

> I think fixing the media sequence is still mandatory.

Could you elaborate on how fixing the sequence should work? Currently what happens, as I see it in the code (and limited tests with console.log):

So, is it expected to start at 1 again after an ad, even for midrolls? If that's the case, I guess it wouldn't work for multiple ads, since self._seq keeps its value.

instance01 commented 5 years ago

Honestly, I'm not sure anymore, especially with midroll ads. I've been testing on https://www.twitch.tv/twitchpresents yesterday and today, and this approach seems the most stable: we take the delta between the sequence number when the ad starts and the sequence number when it stops, and add it to the latest sequence number. That way we only buffer for about 10-15 seconds instead of the full 30 seconds. However, I haven't had the chance to really test it with preroll ads yet.

I am sure that sequence numbers are important, however. Subtract too much and you get constant buffering; add too much and the stream starts glitching heavily. When ads appear and we leave the sequence number as is afterwards, we get a lot of buffering.
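
The delta idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: `readMediaSequence` and `shiftMediaSequence` are hypothetical helpers, the sequence numbers are made up, and whether the delta is applied to every playlist after the ad or only some is left open here.

```javascript
// Hypothetical helper: returns the #EXT-X-MEDIA-SEQUENCE value, or null if absent.
function readMediaSequence(playlist) {
  const match = /^#EXT-X-MEDIA-SEQUENCE:(\d+)/m.exec(playlist);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: rewrites the media sequence tag by adding `delta` to it.
function shiftMediaSequence(playlist, delta) {
  return playlist.replace(
    /^#EXT-X-MEDIA-SEQUENCE:(\d+)/m,
    (_, n) => `#EXT-X-MEDIA-SEQUENCE:${Number(n) + delta}`
  );
}

// Illustrative numbers only: the sequence seen when the ad started and stopped.
const adStartSeq = 100;
const adEndSeq = 115;
const adDelta = adEndSeq - adStartSeq;

// A playlist fetched after the ad, with the delta added to its sequence number.
const latestPlaylist = '#EXTM3U\n#EXT-X-MEDIA-SEQUENCE:116\n#EXTINF:2.0,\nlive.ts\n';
const patchedPlaylist = shiftMediaSequence(latestPlaylist, adDelta);
console.log(readMediaSequence(patchedPlaylist));
```

Given the buffering and glitching reported above, any real implementation would need to be tuned against live preroll and midroll ads.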

Btw, the code didn't work for midroll ads, only for preroll ads; it was buggy. I reworked it today (linked above).

trinitronx commented 3 years ago

This seems like a generally good technique that could apply to other sites whose m3u8 players feed ads into the m3u8 playlist. It may help to let the user edit and add filter rules along the lines of AdBlock+ URL patterns, or simply regular expression pattern matching. This could translate well into making the plugin a more generic m3u8 ad blocker.

EDIT: Adblock apparently added a similar related feature in change #6592. Not sure how to use this, as it's not well documented yet. It looks like it's not actually rewriting or modifying an m3u8 file content, but instead probably rewrites the URL request for the m3u8 file itself. Seems similar to Apache's mod_rewrite type functionality, but for Adblock URLs. Seems like regex modification of the m3u8 file content itself could produce a better result by simply filtering out the ad roll elements before they get to the live stream player.
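
As a rough sketch of what user-editable regular expression rules for playlist content might look like: the rule format (one regex source string per rule, every match deleted), the names `compileRules` and `applyRules`, and the sample pattern are all assumptions made for illustration, not an existing feature of this extension or of Adblock.

```javascript
// Hypothetical rule format: each rule is a regular-expression source string
// supplied by the user; every match is deleted from the playlist text.
function compileRules(patterns) {
  return patterns.map(source => new RegExp(source, 'gim'));
}

// Applies each compiled rule in order to the playlist text.
function applyRules(playlist, rules) {
  return rules.reduce((text, rule) => text.replace(rule, ''), playlist);
}

// Example rule, reusing the stitched-ad DATERANGE pattern from earlier in the thread.
const userRules = compileRules(['^#EXT-X-DATERANGE:ID="stitched-ad.*\\n?']);

const rawPlaylist =
  '#EXTM3U\n#EXT-X-DATERANGE:ID="stitched-ad-1"\n#EXTINF:2.0,\nlive.ts\n';
const cleanedPlaylist = applyRules(rawPlaylist, userRules);
console.log(cleanedPlaylist);
```

Rules written this way operate on the playlist body rather than the request URL, which is the distinction drawn in the EDIT above.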