ajayyy / SponsorBlock

Skip YouTube video sponsors (browser extension)
https://sponsor.ajay.app
GNU General Public License v3.0
9.79k stars 317 forks

Visual Segments: Blocking out social media icons and other intrusive visual content #538

Open LoganDark opened 3 years ago

LoganDark commented 3 years ago

SponsorBlock is invaluable when a portion of a video is entirely unwanted. However, some creators prefer to sprinkle subscribe buttons and social media icons throughout their video, during otherwise unrelated speech or visual activity.

Trinkets like those prefer to collect in the most inconvenient places, in the middle of useful or engaging information. Viewers generally still want that information, so skipping the section completely would not be very helpful. Example of a video utilizing trinkets: the server advertisement is a dedicated section that can be cut out with a self-promotion segment, which I have already submitted. However, the Twitter trinket, which is followed by an Instagram trinket, occurs in the middle of speech/gameplay that should not just be cut out.

I would like to propose a new feature for SponsorBlock: visual segments. Visual segments are not to be skipped like other segments are, but they represent a time window for a particular trinket - defined as a piece of intrusive visual content, like a subscribe button or social media icon, present during an otherwise engaging or important part of the video.

Each visual segment would contain exactly one obstruction, which would be overlaid on top of the video content for the duration of the segment. Of course, it's not nearly as seamless as completely skipping an obnoxious section, and it's pretty obvious that there's something beneath the obstruction - but viewers won't have to see what's behind it if they don't want to, and they will still not miss out on an important section of the video.

Obstructions are solid shapes (consisting of a bounding rectangle and the name of the primitive to draw, such as 'box' or 'circle'). The submitter of the obstruction can define a solid color, which defaults to black, but that's it. Obstruction bounds would be defined using percentages.
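A minimal sketch of what that obstruction record could look like, assuming a TypeScript client. Every name here (`Obstruction`, `PercentBounds`, `makeObstruction`, `toPixelBounds`) is hypothetical and not part of SponsorBlock; the sketch only shows the percentage bounds, the primitive name, and the color defaulting to black.

```typescript
// Hypothetical data model for illustration; not an existing SponsorBlock type.
type ObstructionShape = "box" | "circle";

interface PercentBounds {
    // All values are percentages (0-100) of the video's width or height.
    left: number;
    top: number;
    width: number;
    height: number;
}

interface Obstruction {
    shape: ObstructionShape;
    bounds: PercentBounds;
    color: string; // any CSS color string
}

interface PixelBounds {
    left: number;
    top: number;
    width: number;
    height: number;
}

const DEFAULT_OBSTRUCTION_COLOR = "#000000"; // black unless the submitter picks a color

function makeObstruction(shape: ObstructionShape, bounds: PercentBounds, color?: string): Obstruction {
    return { shape, bounds, color: color ?? DEFAULT_OBSTRUCTION_COLOR };
}

// Convert percentage bounds to pixels for a player of the given size.
function toPixelBounds(bounds: PercentBounds, playerWidth: number, playerHeight: number): PixelBounds {
    return {
        left: (bounds.left * playerWidth) / 100,
        top: (bounds.top * playerHeight) / 100,
        width: (bounds.width * playerWidth) / 100,
        height: (bounds.height * playerHeight) / 100,
    };
}
```

Because the bounds are percentages, the same record works unchanged when the player is resized or goes fullscreen; only `toPixelBounds` has to be re-run.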

Viewers can enable or disable obstructions just like annotations. Viewers can also, in the extension settings, select if they would like to overlay the defined obstruction color, or blur obstructed parts instead (achievable with backdrop-filter).

The only obstruction shapes to start would be box (square/rectangle) and circle (circle/ellipse). There's no need for fancy shapes like stars or arbitrary polygons or even rounded rectangles.
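As a rough illustration of the two shapes and the color/blur setting, here is a sketch that builds the CSS declarations for an overlay element. The function name `obstructionStyle`, the `OverlayMode` type and the default blur radius are assumptions made for this example; `border-radius` and `backdrop-filter: blur()` are standard CSS.

```typescript
// Hypothetical helper: returns CSS declarations for one overlay element.
type OverlayMode = "color" | "blur";

function obstructionStyle(
    shape: "box" | "circle",
    color: string,
    mode: OverlayMode,
    blurPx: number = 12 // assumed default; the proposal does not specify a radius
): Record<string, string> {
    const style: Record<string, string> = {
        "position": "absolute",
        "pointer-events": "none", // clicks should still reach the player
        // 50% on a non-square element yields an ellipse, covering circle/ellipse.
        "border-radius": shape === "circle" ? "50%" : "0",
    };
    if (mode === "blur") {
        // Blur whatever is rendered behind the overlay instead of painting over it.
        style["backdrop-filter"] = `blur(${blurPx}px)`;
    } else {
        style["background-color"] = color;
    }
    return style;
}
```

The element's position and size would come from the percentage bounds; they are left out here to keep the sketch focused on shape and mode.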

To define a visual segment, you would first select its type as 'visual segment', then you are given a basic plotting UI to define one (and only one) obstruction. It's probably up for debate how to plot multiple overlapping visual segments at once; you'd likely only do one at a time by selecting which one the plotter is using. It sounds like it would be rare to see multiple social media icons on different parts of the screen at the same time... but it's actually not so rare to see, for example, one icon in each corner, during outros and such.
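Since several visual segments can overlap in time (one icon per corner, for example), the player side would need to know which ones are active at the current playback position. A small sketch, assuming each segment carries a start and end time in seconds; the `TimedVisualSegment` shape and the function name are hypothetical.

```typescript
// Hypothetical segment shape: one obstruction per segment, identified by id.
interface TimedVisualSegment {
    id: string;
    startTime: number; // seconds, inclusive
    endTime: number;   // seconds, exclusive
}

// Return every segment whose time window contains currentTime.
function activeVisualSegments<T extends TimedVisualSegment>(
    segments: readonly T[],
    currentTime: number
): T[] {
    return segments.filter(
        (segment) => currentTime >= segment.startTime && currentTime < segment.endTime
    );
}
```

The same filter could drive the plotter: the editor picks one of the active segments to edit while the others stay visible.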

Resolved problems:

Unresolved problems:

Right now the questions largely outweigh the answers, so this is mostly a request for comment at the moment. I'd love to work out a more complete architecture for this in a way that most benefits the SponsorBlock project. I'd love for the extension developer(s) to weigh in on this and provide feedback, and I'd love to see a feature like this added to SponsorBlock eventually. Thank you for reading :D

ajayyy commented 3 years ago

Semi-related: https://github.com/ajayyy/SponsorBlock/issues/96

LoganDark commented 3 years ago

Yeah, probably. This is a bit more fleshed out and dedicated to one feature. A good format for a GitHub issue, I'd imagine.

ajayyy commented 3 years ago

Yep, just want to link previous discussion

LoganDark commented 3 years ago

Well, what do you think?

LoganDark commented 3 years ago

image

:(

ajayyy commented 3 years ago

I definitely think this is a cool idea, but it is probably a little out of scope right now. Maybe as a long term goal.

Now that SponsorBlock is old, it does have many people with high reputation, so "dangerous" features like this and #409 can be limited to only those with very high reputation.

ajayyy commented 3 years ago

To work best, they would probably have to be replaced with an image, not just a shape, but that might be too much work to be viable.

Some way to have the "annotations" move smoothly across the screen over time might be needed as well.

LoganDark commented 3 years ago

So far I've submitted 156 segments and saved others about 60 hours. Does that count as high reputation? "Very high reputation" worries me, I wouldn't set the bar too high. I've been submitting segments for about a month so far, on pretty much every video I watch.

To work best, they would probably have to be replaced with an image, not just a shape, but that might be too much work to be viable.

Instead of specifying a solid color, you could have the extension take a screenshot of that portion of the video, and display it for the duration of the visual segment. It's easily something that could happen in an update, rather than on first release.

Some way to have the "annotations" move smoothly across the screen over time might be needed as well.

Careful with this. I intentionally kept this as primitive as possible, to hopefully prevent revealing more than necessary about the underlying distraction.

This is one of the only extensions to the idea I would theoretically approve of (if I were the one calling the shots, which I'm not), but again like other aspects of the 'plotter' the way this would be presented to the editor is completely undefined. You could implement an entire keyframe system, or just have a start and end point and an animation curve (probably the way I would do it). Allow the curve to be edited manually, but provide a couple presets. This adds additional complexity, beware.
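To make the "start and end point plus an animation curve" option concrete, here is a sketch under the assumption that the curve is a function from progress (0 to 1) to eased progress. The preset names and all identifiers are made up for illustration.

```typescript
// Hypothetical types; bounds are percentages of the video size.
interface PercentRect {
    left: number;
    top: number;
    width: number;
    height: number;
}

type EasingFn = (t: number) => number;
type EasingPreset = "linear" | "easeIn" | "easeOut";

// A couple of presets; a manually edited curve would just be another EasingFn.
const EASING_PRESETS: Record<EasingPreset, EasingFn> = {
    linear: (t) => t,
    easeIn: (t) => t * t,
    easeOut: (t) => 1 - (1 - t) * (1 - t),
};

function lerp(a: number, b: number, t: number): number {
    return a + (b - a) * t;
}

// Bounds of the obstruction at currentTime, moving from startRect to endRect.
function animatedBounds(
    startRect: PercentRect,
    endRect: PercentRect,
    segmentStart: number,
    segmentEnd: number,
    currentTime: number,
    easing: EasingFn = EASING_PRESETS.linear
): PercentRect {
    const duration = segmentEnd - segmentStart;
    const raw = duration > 0 ? (currentTime - segmentStart) / duration : 1;
    const progress = easing(Math.min(1, Math.max(0, raw))); // clamp outside the segment
    return {
        left: lerp(startRect.left, endRect.left, progress),
        top: lerp(startRect.top, endRect.top, progress),
        width: lerp(startRect.width, endRect.width, progress),
        height: lerp(startRect.height, endRect.height, progress),
    };
}
```

A full keyframe system would generalize this to a list of rectangles with a curve between each pair, which is where the extra complexity comes from.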

Of course, this obstruction system doesn't even touch things like sound effects... but it would be much more trouble than it's worth to try to hammer out a solution to somehow lessen that without affecting the surrounding audio in a negative way. It would also require SponsorBlock to start processing all audio which comes out of the player which is probably even more complicated than visual segments would be. Mute segments are mildly less complex than that, but still not optimal.

Finding the optimal solution is most likely not possible, and like the rest of SponsorBlock, a compromise needs to be found.

LoganDark commented 3 years ago

Keep in mind that once most of the concerns are worked out, since I'm a programmer myself I could look into implementing this all as a pull request. I just don't want to put work into a plotter that you don't like, or an animation system that you don't like.

ajayyy commented 3 years ago

I agree with your comment about audio, I don't think that is something feasible, especially with just a browser extension.

ajayyy commented 3 years ago

I feel like it may be useful to get the opinions of others on this. Should I make a channel/room on Discord/Matrix to discuss this?

ajayyy commented 3 years ago

Also, on the example link you showed in the first post, I'm not sure just covering it with a screenshot of the video will help much.

LoganDark commented 3 years ago

I feel like it may be useful to get the opinions of others on this. Should I make a channel/room on Discord/Matrix to discuss this?

Sure.

Also, on the example link you showed in the first post, I'm not sure just covering it with a screenshot of the video will help much.

It also won't help much to ask users to invent their own image to overlay. Not to mention the abuse that could bring.

You'd also either have to store every image that they upload, or have their computer make requests to an arbitrary url (BAD BAD BAD BAD), or proxy it through your backend (less bad but still dangerous)

ajayyy commented 3 years ago

You'd also either have to store every image that they upload, or have their computer make requests to an arbitrary url (BAD BAD BAD BAD), or proxy it through your backend (less bad but still dangerous)

While anonymous image upload is scary, I'm not sure if there is another viable way. We've already experienced severe ratelimiting just downloading captions for use with https://github.com/andrewzlee/NeuralBlock and had to shut it down until we find another solution.

I doubt we would be able to download videos on the server to be able to take screenshots server-side, they would have to be done client side.

Could you send an example of what you think the example video should look like to hide the obstructions?

LoganDark commented 3 years ago

While anonymous image upload is scary, I'm not sure if there is another viable way.

You could not have users provide custom images? That's kind of what I was hinting at in my comment.

I doubt we would be able to download videos on the server to be able to take screenshots server-side, they would have to be done client side.

That's not at all what I mean. The client is already playing the video, so when a visual segment is encountered they could just display a portion of the frame that it started on, on top of the video until the segment is over.

It's hard to explain over text, but let me try to show what I mean.

[Three comparison images: Control, Solid color, Persist]

This is not a job for the server anyway. The server should only store data about the shape of the obstruction.

This could be what you were thinking of when you said "custom images". As a plus, the server does not have to do any extra work, and you don't have to think about image uploads. You do have to work out how to display that, though... possibly problems when you skip to the middle of a visual segment and the client does not have that frame in memory. You might just have to use black in that case.
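A sketch of the decision described above: if the client has the frame the segment started on, cover the region with that frozen crop; otherwise fall back to a solid color. The names (`planCover`, `CoverPlan`) are hypothetical. The crop values are meant to be usable as the source rectangle (`sx`, `sy`, `sWidth`, `sHeight`) of a canvas `drawImage` call, which is one possible way to do the actual drawing.

```typescript
// Hypothetical types; bounds are percentages of the video frame.
interface CropBounds {
    left: number;
    top: number;
    width: number;
    height: number;
}

type CoverPlan =
    | { kind: "frozen-frame"; sx: number; sy: number; sw: number; sh: number }
    | { kind: "solid"; color: string };

function planCover(
    bounds: CropBounds,
    videoWidth: number,   // intrinsic frame width in pixels
    videoHeight: number,  // intrinsic frame height in pixels
    hasStartFrame: boolean,
    fallbackColor: string = "#000000"
): CoverPlan {
    if (!hasStartFrame) {
        // e.g. the viewer seeked into the middle of the segment
        return { kind: "solid", color: fallbackColor };
    }
    return {
        kind: "frozen-frame",
        sx: Math.round((bounds.left * videoWidth) / 100),
        sy: Math.round((bounds.top * videoHeight) / 100),
        sw: Math.round((bounds.width * videoWidth) / 100),
        sh: Math.round((bounds.height * videoHeight) / 100),
    };
}
```

Nothing here touches the server: the only stored data is still the shape and its bounds.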

ajayyy commented 3 years ago

I see, that makes more sense than what I was thinking...

LoganDark commented 3 years ago

Slight downside: people may think it's an encoding/compression issue, or the fault of YouTube/the creator. It looks a lot like an artifact.

ajayyy commented 3 years ago

Yes, I'm not sure if it really looks any better than without it being covered, but it would make sense for banner ads.

LoganDark commented 3 years ago

Well, for me, I'd gladly sacrifice FMV in that one area just to avoid seeing that like button. I don't know if you know this (or if everyone does), but things like this absolutely piss me off when I see them in a YouTube video. Sure, using SponsorBlock to attempt to skip it (which I have done for that particular video; here) is a fine solution... in the one case where the entirety of its screen time is spent saying "wow" and "ooh" and "so cool", basically unimportant to the video, and not even missed (or noticed) when it's skipped over.

That image I posted up above (the red "BE SURE TO SUBSCRIBE") happens during the video introduction, which is important content that can not be skipped over. That is where visual segments will shine. Maybe it does count as a banner ad; not sure what your definition is.

As said up above, once the details are worked out, I will happily produce code towards an implementation because of how strongly I feel about this. SponsorBlock is absolutely invaluable, THANK YOU for making YouTube more bearable. (And for letting me take out my anger by making perfectly-timed segments to skip over the offending content, rather than disliking the video and yelling in the comments section.)

ajayyy commented 3 years ago

While you might implement it now, it still has to be maintained long-term, so we still have to make sure it's useful.

I guess we'd not use the same category system for this. This one probably only needs sponsor, self promo, and interaction.

Maybe it does count as a banner ad; not sure what your definition is

I was referring to actual ads (sponsor)

LoganDark commented 3 years ago

That might be a good idea. The current category system only deals with time-skipping segments, but these ones aren't meant to be skipped (in fact, they're an indication that the section IS important and CAN'T be skipped, but there is something annoying on-screen), so it could be a different API that clients (like the extension) can opt-into by manually retrieving the data.

While you might implement it now, it still has to be maintained long-term, so we still have to make sure it's useful.

That's true. Additional reasons to keep it around are always welcome. I believe I've made a good case for it so far, as long as the implementation details (which haven't entirely been decided yet) turn out well.

ajayyy commented 3 years ago

I made the channel as #visual-overlay-discussion. Matrix / Discord

ajayyy commented 3 years ago

UI idea: Add a visual option to the category selector in the submission notice. When selected, an extra option will appear allowing x and y to be selected.

LoganDark commented 3 years ago

an extra option will appear allowing x and y to be selected.

I'd prefer for it to be a little more like ShareX's region select, where you can click and drag over the video to draw your shape, and tweak it afterwards with movement/resize handles. Unless that's what you mean by x and y...
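For the click-and-drag idea, the main piece of logic is turning two pointer positions into percentage bounds. A sketch, assuming the coordinates are already relative to the top-left corner of the video element; the function and type names are invented for this example.

```typescript
// Hypothetical result type: percentages of the video element's size.
interface PercentBox {
    left: number;
    top: number;
    width: number;
    height: number;
}

function clampTo(value: number, min: number, max: number): number {
    return Math.min(max, Math.max(min, value));
}

// Convert a drag from (startX, startY) to (endX, endY) into percentage bounds.
// Dragging in any direction works, and points outside the video are clamped.
function dragToPercentBox(
    startX: number,
    startY: number,
    endX: number,
    endY: number,
    videoWidth: number,
    videoHeight: number
): PercentBox {
    const x1 = clampTo(startX, 0, videoWidth);
    const x2 = clampTo(endX, 0, videoWidth);
    const y1 = clampTo(startY, 0, videoHeight);
    const y2 = clampTo(endY, 0, videoHeight);
    return {
        left: (Math.min(x1, x2) * 100) / videoWidth,
        top: (Math.min(y1, y2) * 100) / videoHeight,
        width: (Math.abs(x2 - x1) * 100) / videoWidth,
        height: (Math.abs(y2 - y1) * 100) / videoHeight,
    };
}
```

Move and resize handles would then just edit the resulting box, so plain x/y inputs and drag selection can share the same stored representation.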

basilevs commented 2 years ago

I very often use YouTube in audio only mode and like small and focused plugins. A feature like this definitely bloats my favorite plugin, brings no value and fits nicely into a separate dedicated plugin. Please do not include it in SponsorBlock.

LoganDark commented 2 years ago

I very often use YouTube in audio only mode and like small and focused plugins. A feature like this definitely bloats my favorite plugin, brings no value and fits nicely into a separate dedicated plugin. Please do not include it in SponsorBlock.

SponsorBlock's purpose extends past podcasts and music videos. I also use YouTube in the background, but I mainly use it when watching visual content.

bloats my favorite plugin, brings no value and fits nicely into a separate dedicated plugin

Bloat: Only slightly.

Brings no value: If you're not watching the video, then visual segments have no use for you, of course. Mute segments don't either because the visual information those mean to preserve isn't being consumed by you, so they may as well just be skipped.

Fits nicely into a separate dedicated plugin: Meh. SponsorBlock has the database, infrastructure and community to ensure that this sort of thing would gain traction, but other projects don't. If I were to code all this up myself into a new extension for example, no one would use it.

basilevs commented 2 years ago

SponsorBlock has the database, infrastructure and community to ensure that this sort of thing would gain traction, but other projects don't. If I were to code all this up myself into a new extension for example, no one would use it.

Another plugin could share infrastructure and community if not database.

ajayyy commented 2 years ago

Blocking visual sponsors is useless without also being able to auto-skip, so would not make sense as a dedicated plugin.

jac0b-w commented 2 years ago

Instead of a solid color shape or an image covering, I think a blur type effect where the colours are sampled from just outside the bounding box of the trinket would look pretty seamless most of the time. It would definitely be a bit harder to implement, though, and would also have to account for any edge of the bounding box being on the edge of the video.

LoganDark commented 2 years ago

a blur type effect where the colours are sampled from just outside the bounding box of the trinket would look pretty seamless most of the time

So like that tunnel filter, but reversed?

jac0b-w commented 2 years ago

Sorry, I'm not familiar with the effect of a tunnel filter. Instead of sampling the pixels directly behind the blur filter like you would typically expect, you sample the pixels around the edge of the bounding box to be blurred.

A very crude implementation could be as simple as this: say you have the bounding box of a trinket which is 100px tall. You take 50px above and 50px below the trinket, append the two images together, blur that new image, and apply it over the trinket.

Hope that makes the basic idea clear.
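The crude version described above can be sketched on a plain grayscale image (an array of rows) so it runs without a browser. Assumptions made for this example: the box lies fully inside the image, the blur is a simple box blur, and when the box touches the top or bottom edge of the video the rows that do exist on the other side are stretched to fill the gap. All names are hypothetical.

```typescript
type GrayImage = number[][]; // rows of pixel intensities

// Simple box blur; pixels outside the patch are ignored in the average.
function boxBlur(patch: GrayImage, radius: number): GrayImage {
    return patch.map((row, y) =>
        row.map((_, x) => {
            let sum = 0;
            let count = 0;
            for (let dy = -radius; dy <= radius; dy++) {
                for (let dx = -radius; dx <= radius; dx++) {
                    const value = patch[y + dy]?.[x + dx];
                    if (value !== undefined) {
                        sum += value;
                        count++;
                    }
                }
            }
            return sum / count; // count >= 1 because the pixel itself is included
        })
    );
}

// Cover the box with blurred rows taken from just above and just below it.
function coverWithNeighbours(
    image: GrayImage,
    top: number,
    left: number,
    boxHeight: number,
    boxWidth: number,
    blurRadius: number = 1
): GrayImage {
    const result = image.map((row) => row.slice()); // leave the input untouched
    const aboveCount = Math.ceil(boxHeight / 2);
    const belowCount = boxHeight - aboveCount;
    const sourceRows: number[] = [];
    for (let i = 0; i < aboveCount; i++) sourceRows.push(top - aboveCount + i);
    for (let j = 0; j < belowCount; j++) sourceRows.push(top + boxHeight + j);
    // Drop rows that fall off the video when the box touches an edge.
    const valid = sourceRows.filter((r) => r >= 0 && r < image.length);
    if (valid.length === 0) return result;

    const patch: GrayImage = [];
    for (let k = 0; k < boxHeight; k++) {
        // Stretch the available rows over the full box height.
        const src = valid[Math.floor((k * valid.length) / boxHeight)] as number;
        patch.push((image[src] as number[]).slice(left, left + boxWidth));
    }
    const blurred = boxBlur(patch, blurRadius);
    blurred.forEach((row, k) =>
        row.forEach((value, x) => {
            (result[top + k] as number[])[left + x] = value;
        })
    );
    return result;
}
```

A real implementation would work on RGBA canvas data and probably sample from the left and right sides as well, but the structure would be the same.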

LoganDark commented 2 years ago

Sorry, I'm not familiar with the effect of a tunnel filter. Instead of sampling the pixels directly behind the blur filter like you would typically expect, you sample the pixels around the edge of the bounding box to be blurred.

A very crude implementation could be as simple as this: say you have the bounding box of a trinket which is 100px tall. You take 50px above and 50px below the trinket, append the two images together, blur that new image, and apply it over the trinket.

Hope that makes the basic idea clear.

Photo Booth (Apple program) has a filter that basically only reveals a circle, and then stretches out the edges of the circle to fill the entire rest of the image. Reversed, that would be filling in a circle by stretching its edges inward.

jac0b-w commented 2 years ago

Photo Booth (Apple program) has a filter that basically only reveals a circle, and then stretches out the edges of the circle to fill the entire rest of the image. Reversed, that would be filling in a circle by stretching its edges inward.

Yes this sounds right!
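One way to read "filling in a circle by stretching its edges inward" is: every pixel inside the circle takes the value of the pixel just outside the circle in the same direction from the centre. The sketch below implements that reading on a grayscale array; it is an interpretation for illustration, not Photo Booth's actual algorithm, and the function names are made up.

```typescript
// Point at the given radius from (cx, cy), in the direction of (x, y).
function edgePointFor(cx: number, cy: number, radius: number, x: number, y: number): { x: number; y: number } {
    const dx = x - cx;
    const dy = y - cy;
    const dist = Math.sqrt(dx * dx + dy * dy);
    if (dist === 0) {
        // The centre has no direction; pick one arbitrarily.
        return { x: cx + radius, y: cy };
    }
    return { x: cx + (dx / dist) * radius, y: cy + (dy / dist) * radius };
}

// Replace every pixel inside the circle with the pixel one step outside the
// circle along the same ray from the centre.
function fillCircleFromEdge(image: number[][], cx: number, cy: number, radius: number): number[][] {
    const result = image.map((row) => row.slice());
    const height = image.length;
    const width = image[0]?.length ?? 0;
    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            const dx = x - cx;
            const dy = y - cy;
            if (Math.sqrt(dx * dx + dy * dy) > radius) continue; // outside: keep as-is
            const edge = edgePointFor(cx, cy, radius + 1, x, y);
            // Nearest-pixel sampling, clamped to the image.
            const sx = Math.min(width - 1, Math.max(0, Math.round(edge.x)));
            const sy = Math.min(height - 1, Math.max(0, Math.round(edge.y)));
            (result[y] as number[])[x] = (image[sy] as number[])[sx] as number;
        }
    }
    return result;
}
```

Blurring the filled region afterwards would hide the radial streaks this kind of stretching produces.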

ajayyy commented 2 years ago

A priority list to make the first implementation less over-complex:

Priorities: Focus on fixing transitions > covering logos/icons. Simpler problem with a more clear goal, can easily be extended to logo covering.

1.

mchangrh commented 1 year ago

I have a working demo with a copied 2D canvas in the discord


latency

Unfortunately there is latency associated with it: the refresh rate is tied to the draw interval. The lower the draw interval, the higher the memory/CPU usage, naturally. Ideally it would be synced to the video, but when set to integers, there can be desync of up to 3 frames just because of frame timings. At best there is one processing time of delay, which might be mitigated by running at 2x the refresh rate.
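A back-of-the-envelope model of that trade-off, assuming the canvas copy runs on a fixed timer that is not aligned with the video's frame clock: a new video frame can wait up to one full draw interval before it is copied, plus the processing time of the copy itself. This is a simplification that ignores timer jitter, so it does not by itself explain the 3-frame desync; the function names are hypothetical.

```typescript
// Worst-case age of the overlay relative to the video, in milliseconds.
function worstCaseStalenessMs(drawIntervalMs: number, processMs: number): number {
    return drawIntervalMs + processMs;
}

// Express a staleness in units of video frames.
function stalenessInFrames(stalenessMs: number, videoFps: number): number {
    return (stalenessMs * videoFps) / 1000;
}
```

For a 25 fps video (40 ms per frame), drawing once per frame gives up to one frame of lag before processing time, and drawing at twice the frame rate (20 ms) halves that, at the cost of twice as many copies.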

ajayyy commented 1 year ago

I remember seeing a demo in one of the old links I posted where you can grab the video frame early to defeat the latency (double buffering).
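The demo itself is not linked here, so the following is only a generic sketch of double buffering, not a reconstruction of it: one buffer is shown while the next frame is prepared in the other, and the two are swapped once the new frame is ready. In the extension the buffers would presumably be canvases; the class below is hypothetical and works with any buffer type.

```typescript
// Generic double buffer: draw into `hidden`, show `visible`, then swap.
class DoubleBuffer<T> {
    private front: T;
    private back: T;

    constructor(first: T, second: T) {
        this.front = first;
        this.back = second;
    }

    // The buffer currently being displayed.
    get visible(): T {
        return this.front;
    }

    // The buffer that is safe to draw the upcoming frame into.
    get hidden(): T {
        return this.back;
    }

    // Make the freshly drawn buffer visible and recycle the old one.
    swap(): void {
        const previous = this.front;
        this.front = this.back;
        this.back = previous;
    }
}
```

The latency benefit comes from filling the hidden buffer ahead of time, so that the swap at display time is cheap compared with copying and processing a frame on the spot.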