brave / brave-browser

Brave browser for Android, iOS, Linux, macOS, Windows.
https://brave.com
Mozilla Public License 2.0
17.61k stars 2.28k forks source link

The debouncer prevents the query filter and HTTPS Everywhere from modifying URLs #22967

Open pitsi opened 2 years ago

pitsi commented 2 years ago

Description

I just noticed that brave no longer cleans the urls from facebook's tracking parameters. The url contains the ?fbclid= part which brave is supposed to remove. If my description is not good enough to describe it, I mean this feature here https://brave.com/privacy-updates/5-grab-bag/

Steps to Reproduce

  1. Log in to facebook
  2. Find a post that contains a link which leads outside of facebook, e.g. this one https://www.facebook.com/XBMC/posts/369892718499974
  3. In the new tab that opens, notice that its url contains the ?fbclid=alongstringoftrackinggarbage part.

Actual result:

The url has that ?fbclid part

Expected result:

Brave should remove the forementioned part of the url

Reproduces how often:

Almost every time.

Brave version (brave://version info)

Brave   1.38.119 Chromium: 101.0.4951.67 (Official Build) (64-bit) 
Revision    8888ee7a24e2c36661ddb9536c35b7d4852a3a98-refs/branch-heads/4951@{#1230}
OS  Linux

Version/Channel Information:

Other Additional Information:

Miscellaneous Information:

I can only test the forementione behavior on facebook, because I do not use any other social network. Moreover, I think there was a setting somewhere for it, and I can not find it now.

fmarier commented 2 years ago

I created a small test page and found that sometimes the trackers are removed from the query string (as expected) and sometimes they're not: https://fmarier.org/query-filter-test.html

This appears to be related to the URL debouncer. My theory is that the debouncing takes place after the query filter, and possibly all of the other request transformations we make.

pitsi commented 2 years ago

I would lie if I said I understand what you say, but here is what I see on my end. The link from your page to the kodi blog is cleaned from the fbclid part, while the one from facebook still isn't.

fmarier commented 2 years ago

@pitsi Excellent. You are seeing exactly the same thing as me then.

The reason for the problem with just the Facebook URLs is that (I think) there is another privacy feature in Brave which interferes with the one that removes the fbclid tracker.

fmarier commented 2 years ago

I updated my test page and confirmed looking at the network tab of the devtools and both the query filter and HTTPS Everywhere are broken when a URL is redirected due to a debounce.

This now uses a different debouncing rule since the Facebook one was removed in https://github.com/brave/adblock-lists/pull/862. When running in aggressive mode, the interstitial needs to be ignored due to https://github.com/brave/brave-browser/issues/22437.

fmarier commented 2 years ago

Thinking about this some more, it seems like the debouncer should be the very first URL modification to make for two reasons:

kjozwiak commented 2 years ago

Received a report from a user via https://twitter.com/realSamGamgee/status/1541818504801882114?s=20&t=pBYyFPNUgA26_6Ec6eo2uQ that could be related to the above. Seems like the ?fbclid= portion isn't being removed from Twitter links if redirected from FB. STR/Cases:

Example:

https://twitter.com/realSamGamgee/status/1541818504801882114?fbclid=IwAR04KRqcALlg6xYkG4h3GzfZ0O6qBF3idEzh9m78aKuA-aS1d19Qq2--QxA
fmarier commented 2 years ago

@kjozwiak I can also reproduce on Desktop. Can you please file a separate issue? This one doesn't have anything to do with the debouncer.

STR: Type https://twitter.com/realSamGamgee/status/1541819510935756802 in the URL bar.

kjozwiak commented 2 years ago

@kjozwiak I can also reproduce on Desktop. Can you please file a separate issue? This one doesn't have anything to do with the debouncer.

STR: Type https://twitter.com/realSamGamgee/status/1541819510935756802 in the URL bar.

Created https://github.com/brave/brave-browser/issues/23742 👍

pitsi commented 2 years ago

I read this a few minutes ago and I noticed that bleepingcomputer created a test page for testing firefox's new feature. Can anyone make a similar page for testing brave until this issue is resolved? In the forementioned page, brave removes the fbclid parameter. In fact, it removes every parameter except for the mkt_tok= one.

fmarier commented 2 years ago

@pitsi The mkt_tok one is now removed if you use Brave Nightly. It will make it to release in time. In terms of a test page, the one for Firefox works fine. But otherwise, you can just copy their page and add the missing parameters that Brave removes and that Firefox doesn't (e.g. gclid). The BleepingComputer test page doesn't trigger the bug that is discussed in this issue so you'll get accurate results.

pitsi commented 2 years ago

The BleepingComputer test page doesn't trigger the bug that is discussed in this issue so you'll get accurate results.

I did not know that. However, I would like to add that today's brave does clean the urls that have fbclid, so I assume the issue was fixed, but I don't know when. I can't say about tracking parameters from other sites. Feel free to close the issue report here if you want.

fmarier commented 2 years ago

The bug described in this issue only gets triggered when a URL is in our debounce list and contains a tracking parameter. For example https://dev-pages.brave.software/navigation-tracking/error.html?brave_testing=https://brave.com/?fbclid=1234

pitsi commented 2 years ago

Just a heads up for something I just found on hackernews. As it seems, facebook will change its fbclid with something that is impossible to remove from the url https://news.ycombinator.com/item?id=32117489

pitsi commented 2 years ago

I am having an issue with "upgrade connections to https", which does not seem to work on my end, and I would like to know if it is a side effect of the isuse above.

I first tried it on armbian's repo here and it did not switch to https. Then I tried it on some other page that I can not mention it in public, but it still did not work, so now I assume it is a new issue. http://armbian.systemonachip.net/apt/

p.s. Chromium does upgrade it to https, thus I am sure it is a brave issue.

nihil-admirari commented 5 months ago

The same issue is observed on DuckDuckGo if the scripts are disabled: https://github.com/brave/adblock-resources/issues/167.

  1. Block scripts.
  2. Search anything on DuckDuckGo, e.g. by typing :d Brave in omnibox.
  3. All links on the page are now of the form https://duckduckgo.com/l/?uddg=....

Default lists for Brave Browser have an entry for ‘uBlock Origin filters – Privacy’, which must handle click tracking on DuckDuckGo:

html.duckduckgo.com,lite.duckduckgo.com##+js(href-sanitizer, a[href^="//duckduckgo.com/l/?uddg="], ?uddg)

DuckDuckGo is not on a debounce list.

fmarier commented 5 months ago

@nihil-admirari What you're seeing is a different issue: the DDG redirects are not debounced.

It could either be fixed if we added DDG to the debounce list (as you pointed out), or if we supported the href-sanitizer scriptlet properly (the topic of the original issue you filed).

nihil-admirari commented 5 months ago

Are there any plans to add the support for href-sanitizers?

Out of curiosity: how many features of uBlock Origin are actually not supported by Brave's adblocker?