uBlockOrigin / uBlock-issues

This is the community-maintained issue tracker for uBlock Origin
https://github.com/gorhill/uBlock

Anti-redirect scriptlet for vk.com (href-sanitizer) #2531

Closed: dimisa-RUAdList closed this issue 1 year ago

dimisa-RUAdList commented 1 year ago

Description

The site vk.com is one of the most visited Russian sites. All external links on this site go through a redirect.

The idea is to create a scriptlet with a name like "anti-redirect" based on this script: https://greasyfork.org/en/scripts/395977-vk-com-del-redir/code

A specific URL where the issue occurs.

https://vk.com/amedia.online

Configuration

uBlock Origin development build 1.47.5b2

gorhill commented 1 year ago

Your request is the equivalent of asking uBO to deal with redirect link, something which I have declined in the past, see https://github.com/uBlockOrigin/uBlock-issues/issues/1784.

ClearURLs is the right tool for this.

dimisa-RUAdList commented 1 year ago

The thing is, ClearURLs doesn't solve the problem on this site. The transition is still done via vk.com/away.php.

gorhill commented 1 year ago

If it doesn't work with ClearURLs then maybe it's best to report the issue on their repo?

dimisa-RUAdList commented 1 year ago

Thanks, but I thought it might be possible to solve this problem with uBO. At the moment, almost all such tasks on Russian sites are solved by the Counters filter. However, the current features of uBO do not allow me to do the same for vk.com. Could you perhaps still take a look at the script code?

gorhill commented 1 year ago

I prefer to decline, as explained in these similar requests:

Adding a scriptlet for this one particular case, besides the fact that vk.com could change at any time (further adding burden here), opens the door for all such requests in the future. This is best left to a dedicated extension, or even more simply, it is just a matter of using a user script extension.

MasterKia commented 1 year ago

@dimisa-RUAdList What about using that userscript in the userResourcesLocation of uBO? It needs a bit of modification. Then you can use a filter like vk.com##+js(anti-redirect) in your Counters list.

dimisa-RUAdList commented 1 year ago

@MasterKia Thank you, I have not yet given up hope of creating a solution using the default tools of uBO.

@gorhill I carefully reviewed all the rejected cases, including my own.

I think it is possible to create a solution that will be adequate and useful for many cases, although not for all.

I propose to create a scriptlet to modify the value of the href attribute. Its only function will be to replace the original href value with its text value.

Universal. No complicated and risky redirects, no additions, absolutely transparent, using only what already exists in the original layout.

Example: https://vk.com/amedia.online

Screenshot(s) ![vkhref](https://user-images.githubusercontent.com/20126984/224003869-d1d58d3c-42e0-4de8-b3d3-1074b7d56186.jpg)

Target: href="/away.php?to=https%3A%2F%2Famedia.online%2F&cc_key=" ~> https://amedia.online/

Solution: vk.com##+js(href-to-text, [href^="/away.php?to="])

MasterKia commented 1 year ago

> I propose to create a scriptlet to modify the value of the href attribute. Its only function will be to replace the original href value with its text value.

Sounds similar to https://github.com/uBlockOrigin/uBlock-issues/issues/2347.

uBlock-user commented 1 year ago

uBO has a built-in url sanitiser deployed when the document request is blocked. So add ||vk.com/away.php^$doc and uBO will sanitise the URL and offer a clean URL on the document blocked page for the users to navigate directly.

dimisa-RUAdList commented 1 year ago

@uBlock-user Thank you, but no. If I add such a rule to the filters, it will only be a terrible irritant. This is not a solution. This is some kind of sophisticated way to motivate users to remove uBO.

For now, I'd rather wait for a response from Gorhill.

uBlock-user commented 1 year ago

> This is some kind of sophisticated way to motivate users to remove uBO.

^^^ If that's how you feel about it, but Strict Blocking as a feature has been present in uBO for years and nobody removes uBO because of it, just so you know.

dimisa-RUAdList commented 1 year ago

@uBlock-user Thanks for the clarification, but I'm aware of how Strict Blocking works.

However, its infrequent triggers are not at all the same as a constant trigger on every external transition on one of the most visited Russian sites.

Do not be offended, but I do not intend to continue discussing the solution you proposed. I'm just waiting for a response from Gorhill.

gorhill commented 1 year ago

I added a href-from-text scriptlet as suggested. I did not need much convincing for this idea because I went through the same idea earlier this week with the t.co debacle -- where I envisioned that it would have been nice if uBO replaced the t.co link with the text shown to user which is the actual URL -- in which case users of uBO would have been unaffected by the debacle.

Having more than one use case for such a scriptlet, especially both from high traffic sites, is enough for me to want to try this -- let's just consider it experimental at this point in case serious issues with using such scriptlets are identified later.

So for your use case, this filter works:

vk.com##+js(href-from-text, a[href^="/away.php?to="])

For Twitter, this works:

twitter.com##+js(href-from-text, a[href^="https://t.co/"])
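
As an illustration of the idea behind these filters, here is a minimal JavaScript sketch. It is not uBO's actual scriptlet code: `hrefFromTextSketch` and `makeFakeLink` are hypothetical names, and the fake link object only stands in for a DOM `<a>` element so the snippet can run outside a browser.

```javascript
// Illustrative sketch only; not uBO's actual scriptlet code.
// `link` is any object exposing `textContent` and `setAttribute`,
// for example a DOM <a> element.
function hrefFromTextSketch(link) {
    const text = (link.textContent || '').trim();
    // Only accept text that looks like an absolute https URL.
    if ( /^https:\/\/./.test(text) === false ) { return false; }
    link.setAttribute('href', text);
    return true;
}

// Hypothetical stand-in for an <a> element.
function makeFakeLink(href, textContent) {
    const attrs = { href };
    return {
        textContent,
        getAttribute: name => (name in attrs ? attrs[name] : null),
        setAttribute: (name, value) => { attrs[name] = String(value); },
    };
}

const vkLink = makeFakeLink(
    '/away.php?to=https%3A%2F%2Famedia.online%2F&cc_key=',
    'https://amedia.online/'
);
hrefFromTextSketch(vkLink);
console.log(vkLink.getAttribute('href')); // https://amedia.online/
```

When the visible text is not an absolute https URL, the sketch leaves the original href untouched.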

MasterKia commented 1 year ago

To fix YouTube tracking user clicks of the video description links, I tried:

youtube.com##+js(href-from-text, a[href^="https://www.youtube.com/redirect?"])

It works; but when I click on the description link, YouTube opens the previous link with the help of a click event listener.

Example: https://www.youtube.com/watch?v=pAZWv8QQ8Co


Added to wiki: https://github.com/uBlockOrigin/uBlock-issues/wiki/Resources-Library#href-from-textjs-

gorhill commented 1 year ago

That case is YouTube warning that you are leaving the site, so it's not done silently, and because of this I would rather we not bypass these warnings.

MasterKia commented 1 year ago

@gorhill That warning only appears if the tracking parameters are removed, something which the AdGuard URL Tracking list declined to address.

So if you visit: https://www.youtube.com/redirect?event=video_description&redir_token=QUFFLUhqbG5ERk9rZVkwbzZFV1ZqekFrMEpYVVlzZFNyQXxBQ3Jtc0tucjFxYS1zT3cyczlVWWVzTW5qV1ZmaHFZaHFiLUpFUlpQSV8zNS0yaVdKczRXdlc0Nl82OU9VUlN1bjFvNDl6SElKQk9TdDBDT25ITVRhTnpzcy1JZ3lYVDdKWVBYcGhRSWpWblMtWkM3aXFoSUtIUQ&q=https%3A%2F%2Fwww.paypal.me%2Fpaysyfr&v=pAZWv8QQ8Co

It will silently redirect you to https://www.paypal.me/paysyfr and also track your click; no warning is given.

But if you add these to remove the tracking parameters:

||youtube.com/redirect?$removeparam=event
||youtube.com/redirect?$removeparam=redir_token
||youtube.com/redirect?$removeparam=v

Then it will give you a warning when you visit the link.

Anyway, I don't think it's something that can be addressed easily (other than messing with that click event listener?).
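
To make the effect of the three removeparam filters concrete, the following sketch strips the same parameters from a redirect URL using the standard `URL` API. This is only an approximation for illustration: uBO applies removeparam to network requests rather than through code like this, `removeParamsSketch` is a hypothetical name, and `TOKEN` is a placeholder for the long redir_token value.

```javascript
// Illustrative only; `removeParamsSketch` is a hypothetical helper.
function removeParamsSketch(href, names) {
    const url = new URL(href);
    for ( const name of names ) {
        url.searchParams.delete(name);
    }
    return url;
}

// Shortened stand-in for the redirect URL quoted above.
const ytRedirect = 'https://www.youtube.com/redirect?event=video_description' +
    '&redir_token=TOKEN&q=https%3A%2F%2Fwww.paypal.me%2Fpaysyfr&v=pAZWv8QQ8Co';
const ytCleaned = removeParamsSketch(ytRedirect, [ 'event', 'redir_token', 'v' ]);

console.log([ ...ytCleaned.searchParams.keys() ]); // [ 'q' ]
console.log(ytCleaned.searchParams.get('q'));      // https://www.paypal.me/paysyfr
```

Only the destination parameter `q` survives, which matches the situation described above where YouTube then shows its warning page.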

gorhill commented 1 year ago

Oh ok, I didn't realize this. Not sure if the issue with the event handler can be fixed, this will need some research.

dimisa-RUAdList commented 1 year ago

@gorhill I have found that in some cases, the link given in the text is shortened, and then the rule creates an invalid link.

The full link is contained in the title attribute.

Example: https://vk.com/amedia.online

Screenshot(s) ![hreffromtext](https://user-images.githubusercontent.com/20126984/224092205-381c014c-3ebd-4fd7-973b-83e1a5dca48f.jpg)

It looks like the scriptlet should be more flexible to be able to specify where exactly the link should come from.

Then the rules will be something like this:

vk.com##+js(href-from-text, a[href^="/away.php?to="]:not([title]), text)
vk.com##+js(href-from-text, a[href^="/away.php?to="][title], [title])

gorhill commented 1 year ago

Changed in the next dev build; however, I renamed href-from-text to href-sanitizer. You can leave out the second argument to extract from text content (default).
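
A rough sketch of how the optional second argument could select the source of the URL is shown below. It is an assumption-laden illustration rather than uBO's implementation: `pickCandidateSketch` is a hypothetical name, the `[name]` attribute form is taken from the filters in this thread, and the link values are made up.

```javascript
// Hypothetical helper, not uBO's code: choose where the candidate URL comes from.
// `source` omitted -> text content (the default); `[name]` -> that attribute.
function pickCandidateSketch(link, source) {
    if ( source === undefined || source === '' ) {
        return (link.textContent || '').trim();
    }
    const match = /^\[([^\]]+)\]$/.exec(source);
    if ( match === null ) { return ''; }
    return (link.getAttribute(match[1]) || '').trim();
}

// Stand-in for a link whose visible text is shortened (values are made up).
const shortenedLink = {
    textContent: 'https://example.com/a-very-long-pa..',
    getAttribute: name => (
        name === 'title' ? 'https://example.com/a-very-long-path' : null
    ),
};

console.log(pickCandidateSketch(shortenedLink));            // https://example.com/a-very-long-pa..
console.log(pickCandidateSketch(shortenedLink, '[title]')); // https://example.com/a-very-long-path
```

With the shortened text the default source yields an invalid link, while the `[title]` source yields the full URL, which is the case reported above.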

dimisa-RUAdList commented 1 year ago

@gorhill Yes, everything is great now! The vast majority of links have become direct.

dimisa-RUAdList commented 1 year ago

@gorhill Sometimes a direct link is contained in an attribute on a child element.

Example: https://vk.com/vokrugsveta_vk

Screenshot(s) ![hreffromtext_child](https://user-images.githubusercontent.com/20126984/224164869-b84376d0-b897-400f-ab9a-0f9669530e9c.jpg)

Requires the ability to point to it: vk.com##+js(href-sanitizer, a[class="media_link__media"][href^="/away.php?to="], a[class="media_link__media"] > [data-link-url])

dimisa-RUAdList commented 1 year ago

@gorhill Can we expect the scriptlet to improve in terms of pointing to links contained in child elements?

krystian3w commented 1 year ago

It may be failing on:

https://twitter.com/Post_Courier/status/1121677734940332032

but I cannot improve it the way it was done for VK (characters eaten in the DOM tree?).

gorhill commented 1 year ago

Right, it's not doing well on Twitter. Maybe something changed, because when I looked at it during the debacle, I am pretty sure all links were shown; now they are no longer shown, there is just a "card" with no information whatsoever about the actual link.

stephenhawk8054 commented 1 year ago

Is it because the URL in the above twitter link is http: instead of https:? When I change the matching line

if ( /^https:\/\/./.test(text) === false ) { return ''; }

to

if ( /^https?:\/\/./.test(text) === false ) { return ''; }

then twitter.com##+js(href-sanitizer, a[href^="https://t.co/"]) works for me.
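
The difference between the two checks can be seen in isolation with a small test. The sample URL is hypothetical; the two regular expressions are the ones quoted above.

```javascript
// The two validation patterns quoted above.
const httpsOnlyCheck = /^https:\/\/./;
const httpOrHttpsCheck = /^https?:\/\/./;

// Hypothetical link text using plain http.
const plainHttpText = 'http://example.com/article';

console.log(httpsOnlyCheck.test(plainHttpText));   // false
console.log(httpOrHttpsCheck.test(plainHttpText)); // true
```

With the stricter pattern, link text that starts with `http://` is rejected, so the original href would be left in place.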

v1gnesh commented 1 year ago

Firstly, thank you for the tireless work from everyone involved in making the internet habitable.

I see that t.co thing doesn't work when the links show up in a tweet with the preview card. Example: https://twitter.com/VoltronData/status/1641145746639421442

The actual URL isn't in the text of the tweet, because the link preview card takes its place. Can this be made to work, by any chance?

krystian3w commented 1 year ago

For me it requires some kind of proxy. So beyond the scope of the experiment.

gorhill commented 1 year ago

> Can this be made to work, by any chance?

Can't if there is no information in the DOM about the real link.

v1gnesh commented 1 year ago

Ok :(, thanks for responding!

MasterKia commented 1 year ago

Screenshot_20230330_220152

gorhill commented 1 year ago

That's just a domain name, that's not the actual destination URL.

Yuki2718 commented 1 year ago

@gorhill Would it be possible to replace the link with a part of the href attribute? For example, the Android link on https://www.mozilla.org/ja/firefox/browsers/mobile/android/ in my locale has this href: https://app.adjust.com/2uo1qc?redirect=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dorg.mozilla.firefox&campaign=www.mozilla.org&adgroup=mobile-android-page so the destination is included as the redirect parameter. I tried mozilla.org##+js(href-sanitizer, #playStoreLink-primary, [data-mozillaonline-link]) but somehow [data-mozillaonline-link] is a wrong URL and thus doesn't work. Maybe something like mozilla.org##+js(href-sanitizer, #playStoreLink-primary, [href], ?redirect=) to specify the starting point of the desired text, or a regex to specify the part of the text more exactly? I see this is a different form of $queryjump.

stephenhawk8054 commented 1 year ago

^ I think integrating it to some future $queryjump is better? To reduce the complexity of the scriptlet.

Yuki2718 commented 1 year ago

adjust.com strict-blocking is a major annoyance, at least in Japan: I have received complaints from users and I occasionally come across it myself. Couldn't be happier if it can be fixed sooner. Oh, though I haven't checked other adjust.com links so far; I need time as I don't remember where they were.

stephenhawk8054 commented 1 year ago

Yeah, I also thought a redirection feature like the $queryjump idea that's compatible with strict-blocking would help reduce complaints. It would also help in many cases of email tracking, like awstrack.com.

Or we can extend to some other strict-blocking filter without worrying about troubles in large sites.

But I think this needs another dedicated thread like removeparam.

krystian3w commented 1 year ago

https://github.com/uBlockOrigin/uBlock-issues/issues/2531#issuecomment-1512373333 - this can be added for users with European/American system/browser configurations:

https://user-images.githubusercontent.com/35370833/232686106-db8964bd-c6a6-4ad4-b514-769675cec983.png (works too on PC Firefox)

There will be a minimal reduction in reports, unless you need it in the Japanese AdGuard list instead of in a supplement https://github.com/Yuki2718/adblock2/blob/main/japanese/jpfp-ub.txt.

Yuki2718 commented 1 year ago

> There will be a minimal reduction in reports, unless you need it in the Japanese AdGuard list instead of in a supplement

IDK what you mean.

krystian3w commented 1 year ago

What I'm trying to say is that you can already add this first filter (mozilla.org##+js(href-sanitizer, #playStoreLink-primary, [data-mozillaonline-link])) to the AdGuard JP "database" or your private project, in order to reduce complaints on e.g. Twitter about not being able to download Firefox because Mozilla used ugly tracking links. Such complaints can come from users in Japan with a non-default PC configuration, e.g. a user who forced Polish in settings because Japanese is their second language, like Emil/Emma from YouTube (leaving aside that they both know English quite well).

Yuki2718 commented 1 year ago

I tried mozilla.org##+js(href-sanitizer, #playStoreLink-primary, [data-mozillaonline-link]) but somehow [data-mozillaonline-link] is a wrong url and thus doesn't work.

MasterKia commented 1 year ago

@Yuki2718 After the new update, this works: mozilla.org##+js(href-sanitizer, a[href^="https://app.adjust.com/"][href*="?redirect="], ?redirect)
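
The `?redirect` form can be pictured as reading one query parameter from the original href. The sketch below uses the standard `URL` API on the adjust.com link quoted earlier in the thread; `fromQueryParamSketch` is a hypothetical name and the code is not taken from uBO.

```javascript
// Hypothetical helper, not uBO's code: read the destination from a query
// parameter named by a `?name` source argument.
function fromQueryParamSketch(href, source, base) {
    if ( source.startsWith('?') === false ) { return ''; }
    let url;
    try {
        url = new URL(href, base);
    } catch {
        return '';
    }
    const value = url.searchParams.get(source.slice(1)) || '';
    // Keep only values that look like absolute https URLs.
    return /^https:\/\/./.test(value) ? value : '';
}

const adjustHref = 'https://app.adjust.com/2uo1qc?redirect=https%3A%2F%2Fplay.google.com' +
    '%2Fstore%2Fapps%2Fdetails%3Fid%3Dorg.mozilla.firefox' +
    '&campaign=www.mozilla.org&adgroup=mobile-android-page';

console.log(fromQueryParamSketch(adjustHref, '?redirect'));
// https://play.google.com/store/apps/details?id=org.mozilla.firefox
```

`searchParams.get` percent-decodes the value, so the result is the plain destination URL.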

mapx- commented 1 year ago

@gorhill

href-sanitizer cannot be used on links like

https://www.jpvhub.com/download?link=vidoza.net/lfz41z8271nw.html

just because https:// is missing?

test page (on download buttons) https://www.jpvhub.com/jp/9edgp69d8k/jav/%E6%9C%89%E4%BF%AE%E6%AD%A3/first-impression-azumi-ipz-094

jpvhub.com##+js(href-sanitizer, a[href^="/download?link="])

krystian3w commented 1 year ago

Test the latest idea instead of skipping the parameter:

jpvhub.com##+js(href-sanitizer, a[href^="/download?link="], ?link)

gorhill commented 1 year ago

> Just because https:// is missing?

Yes, that's the issue (if you use the ?link extra parameter). There is no way for the scriptlet to figure out whether what follows ?link= is a relative or absolute URL. To handle such cases we would need a trusted scriptlet with more power.
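
The ambiguity can be shown with the standard `URL` API. This is only an illustration of the two possible readings of the parameter value, using the jpvhub.com link from the comment above; the variable names are made up.

```javascript
const jpvPage = 'https://www.jpvhub.com/';
const jpvHref = '/download?link=vidoza.net/lfz41z8271nw.html';

// The raw parameter value carries no scheme.
const jpvValue = new URL(jpvHref, jpvPage).searchParams.get('link');
console.log(jpvValue);                        // vidoza.net/lfz41z8271nw.html
console.log(/^https:\/\/./.test(jpvValue));   // false

// Reading 1: treat the value as a path relative to the current site.
console.log(new URL(jpvValue, jpvPage).href); // https://www.jpvhub.com/vidoza.net/lfz41z8271nw.html

// Reading 2: treat the value as a host plus path and prepend a scheme.
console.log(new URL('https://' + jpvValue).href); // https://vidoza.net/lfz41z8271nw.html
```

Both readings are valid URLs, and nothing in the value itself says which one the site intends.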

dimisa-RUAdList commented 1 year ago

How to remove such a redirect? ~> https://yap.ru/forum2/topic2625123.html

Screenshot(s) ![Yap](https://github.com/uBlockOrigin/uBlock-issues/assets/20126984/1ab51257-6e49-4dee-86f2-c7b1e66e8b82)

uBlock-user commented 1 year ago

> How to remove such a redirect? ~> https://yap.ru/forum2/topic2625123.html

There's no query param to extract url from. Need to modify the scriptlet to handle such cases.

Yuki2718 commented 8 months ago

Is it possible to add an argument to reverse the current behavior for ? and also support #, i.e. instead of using the text after ?, cut off the text after a specific symbol and use the rest of the href? This would solve https://github.com/DandelionSprout/adfilt/discussions/163#discussioncomment-8248048 where links in "More from WIRED" are like https://www.wired.com/story/tech-layoffs-2024-amazon-google-discord-twitch/#intcid=_wired-bottom-recirc-version3_dd2b801a-5abd-47bf-b8d8-f12a1b71d254_roberta-similarity1. Or does it require a totally different solution other than uritransform? I think uritransform is overkill for this.
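
For clarity, the requested behavior for `#` amounts to dropping the fragment and keeping the rest of the href. The sketch below shows that result with the standard `URL` API; it illustrates the request only, not an existing scriptlet option, and `stripFragmentSketch` is a hypothetical name.

```javascript
// Hypothetical helper: drop everything from `#` onward.
function stripFragmentSketch(href) {
    const url = new URL(href);
    url.hash = '';
    return url.href;
}

const wiredHref = 'https://www.wired.com/story/tech-layoffs-2024-amazon-google-discord-twitch/' +
    '#intcid=_wired-bottom-recirc-version3_dd2b801a-5abd-47bf-b8d8-f12a1b71d254_roberta-similarity1';

console.log(stripFragmentSketch(wiredHref));
// https://www.wired.com/story/tech-layoffs-2024-amazon-google-discord-twitch/
```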

stephenhawk8054 commented 6 months ago

About twitter t.co redirect, can anyone test if this filter works?

twitter.com,~platform.twitter.com##+js(trusted-replace-xhr-response, '/,"expanded_url":"([^"]+)","url":"[^"]+"/g', ',"expanded_url":"$1","url":"$1"', /api/graphql)
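
The effect of the pattern and replacement in this filter can be tried on a plain string. The response fragment below is a made-up example, not real Twitter API output; only the regular expression and the replacement string come from the filter.

```javascript
// Pattern and replacement taken from the filter above.
const expandedUrlPattern = /,"expanded_url":"([^"]+)","url":"[^"]+"/g;
const expandedUrlReplacement = ',"expanded_url":"$1","url":"$1"';

// Hypothetical response fragment.
const sampleBody = '{"display_url":"example.com/a",' +
    '"expanded_url":"https://example.com/a","url":"https://t.co/abc123"}';

const patchedBody = sampleBody.replace(expandedUrlPattern, expandedUrlReplacement);
console.log(patchedBody);
// {"display_url":"example.com/a","expanded_url":"https://example.com/a","url":"https://example.com/a"}
```

The captured `expanded_url` value is copied into the `url` field, so the t.co link no longer appears in the patched text.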

Yuki2718 commented 6 months ago

It's working.

stephenhawk8054 commented 6 months ago

Ok, I'll test it with dev build first

D4niloMR commented 6 months ago

On my end the matching url is https://api.twitter.com/graphql/7xflPyRiUxGVbJd4uWmbfg/TweetResultByRestId?variables=[...]

Screenshot ![image](https://github.com/uBlockOrigin/uBlock-issues/assets/70459964/09b1f43a-80cf-4ade-9738-a428bf989f9a)
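
If the last argument of the filter is matched as a plain substring of the request URL (an assumption; the thread does not state how it is matched), the reported URL would not match `/api/graphql`. The check below illustrates this; the second URL is a hypothetical same-site form added for comparison, and the elided `?variables=[...]` part is left out.

```javascript
const urlNeedle = '/api/graphql';

// URL reported above, without the elided query string.
const reportedUrl = 'https://api.twitter.com/graphql/7xflPyRiUxGVbJd4uWmbfg/TweetResultByRestId';
// Hypothetical URL shape that would contain the needle.
const hypotheticalUrl = 'https://twitter.com/i/api/graphql/7xflPyRiUxGVbJd4uWmbfg/TweetResultByRestId';

console.log(reportedUrl.includes(urlNeedle));     // false
console.log(hypotheticalUrl.includes(urlNeedle)); // true
console.log(reportedUrl.includes('/graphql/'));   // true
```

Under that assumption, a shorter needle such as `/graphql/` would cover both URL shapes.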