arkenfox / user.js

Firefox privacy, security and anti-tracking: a comprehensive user.js template for configuration and hardening
MIT License

Beacon API - don't bother? #1586

Closed: fxbrit closed this 1 year ago

fxbrit commented 1 year ago

I think we should get rid of 2602, the analytics it can deliver can be obtained and shared elsewhere so there's no net gain; on the flip side, it might impact performance and we had a couple reports of breakage in the past.

bonus: one less flip, one less API disabled, arguable fingerprint gain.

sources:

Thorin-Oakenpants commented 1 year ago

so there's no net gain

This is the same argument Mozilla made for why they would keep the API and why it would be default enabled (not going to bother to look it up). If it's not that important, then why not get rid of it?

so did you check the entire internet and discover that no-one uses beacon without the other "elsewhere"?

There's a big difference between scratching your ass and digging a huge hole in it. Let's not give away "free easy stuff"

it might impact performance

I don't see how

and we had a couple reports of breakage in the past

Really, seriously? Fuck those two sites then

Thorin-Oakenpants commented 1 year ago

also: see 40783#note_2854455 and pierov's comment in bold (while it lasts)

arguable fingerprint gain

FYI: it's the only item in navigator that we disable - easily discerned. But of course we're not going to beat advanced scripts in FF
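A minimal sketch of why a disabled Beacon API is "easily discerned". It assumes that with the pref flipped the method is simply not exposed on `navigator`; the function name `exposesSendBeacon` is hypothetical and only for illustration.

```typescript
// Returns true when the navigator-like object exposes sendBeacon.
// The `in` operator also walks the prototype chain, which is where
// browsers define the method (on Navigator.prototype).
function exposesSendBeacon(nav: object): boolean {
  return "sendBeacon" in nav;
}
```

In a page this would be called as `exposesSendBeacon(navigator)`; a `false` result in an otherwise current Firefox is one extra bit a script can read.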

fxbrit commented 1 year ago

If it's not that important, then why not get rid of it?

I don't see how

MDN says:

With the sendBeacon() method, the data is transmitted asynchronously when the user agent has an opportunity to do so, without delaying unload or the next navigation. This means:

  • The data is sent reliably
  • It's sent asynchronously
  • It doesn't impact the loading of the next page
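For reference, the behaviour MDN describes can be sketched roughly as follows. This is an illustrative sketch, not code from any particular site: the `BeaconNavigator` interface, the `queueAnalytics` helper and the `/collect` endpoint are all assumptions.

```typescript
// Minimal shape of the parts of `navigator` this sketch needs.
interface BeaconNavigator {
  sendBeacon?: (url: string, data?: string) => boolean;
}

// Hands the payload to the browser for asynchronous delivery.
// Returns true if the browser queued it, false if the API is
// missing or the browser refused to queue the data.
function queueAnalytics(nav: BeaconNavigator, url: string, payload: object): boolean {
  if (typeof nav.sendBeacon !== "function") {
    return false; // API disabled or unsupported
  }
  // Called as a method so `this` stays bound to the navigator object.
  return nav.sendBeacon(url, JSON.stringify(payload));
}

// Hypothetical usage in a page:
// document.addEventListener("visibilitychange", () => {
//   if (document.visibilityState === "hidden") {
//     queueAnalytics(navigator, "/collect", { event: "pagehide" });
//   }
// });
```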

there's also more in the w3 link, specifically:

The delivered data might contain potentially sensitive information, for example, data about a user's activity on a web page, to a server. While this can have privacy implications for the user, existing methods, such as scripted form-submit, image beacons, and XHR/fetch requests provide similar capabilities, but come with various and costly performance tradeoffs

it's an async way to do something websites can already do synchronously (which is what hurts performance). I think that's also what Mozilla meant when they told you "no net gain": disabling the API doesn't make that information unavailable.

but I'm curious to hear what you and possibly pierov think of it, it's totally possible I'm missing something obvious.

Thorin-Oakenpants commented 1 year ago

I'll reopen as a reminder, not as a sign that I'm going to change it. I am firmly of the position that disabling it removes most of that analytics traffic (since the alternatives are, as per the MDN link, legacy and wonky)

perf: OK, but I doubt you would even notice it: with beacon disabled, little is being sent at all (because the alt methods are legacy and problematic), especially with uBO blocking tracking and analytics scripts anyway

Thorin-Oakenpants commented 1 year ago

https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/41467

Thorin-Oakenpants commented 1 year ago

so according to ma1 there's no specific threat (and TB blocks it already via NS)

and the next comment kinda sums it up

The bold is mine. So what is so important about the unload event that it gains? For analytics I get it. Anything "evil" here is probably already collected (and sent), unless I'm missing something. Maybe a link click to open in the same window tab? Big deal. I'm not actually seeing any real threat here, though disabling it doesn't break anything IMO (fuck those two sites)

class, discuss

fxbrit commented 1 year ago

I think ma1 is spot on and I vote for my proposal 🙋‍♂️

BillyJoeJimBob commented 1 year ago

For what it's worth, see comment 6:

The question of whether to implement an API like sendBeacon() is not related to privacy at all, rather it is about performance and user experience. There are two things to pay attention to here:

A) The sendBeacon() API does not provide any additional capabilities to the page in terms of connecting to a remote analytics server in order to submit some analytics data (which may contain some cross-site tracking identifiers inside it too.) Before this API was available, pages used to implement the same functionality through mechanisms such as using synchronous XHR in unload event handlers or asynchronous XHR with an artificial timeout delaying the rest of the processing of the page to ensure that the analytics ping has been submitted to the server.

B) The sendBeacon() API merely provides an asynchronous non-blocking feature for pages to do exactly what they were already capable of doing. But in doing so, the browser can now take over the submission of this ping and do so without blocking the execution of the page or delaying some part of it.

Therefore, through providing sendBeacon(), we have replaced a slow, user-hostile mechanism for sending analytics pings with a fast user-friendly one. From a privacy perspective, nothing changed after we implemented the sendBeacon API.

More importantly, note that most pages use feature detection to decide how to submit their analytics pings, and if they detect that the browser doesn't support their preferred mechanism they fall back to the less preferred ones. So disabling sendBeacon in practice would put most pages using this feature on the slower code path they were previously using to do exactly what they were doing before. That means that it would provide zero privacy benefits, and would hurt the performance that the user would experience.
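The feature-detection fallback described in that last paragraph might look roughly like this. It is a sketch under assumptions: the `Transport` names, the `PageEnvironment` shape and the order of preference are illustrative, not taken from any real analytics library.

```typescript
// The delivery mechanisms a page might choose between, best first.
type Transport = "sendBeacon" | "fetch" | "sync-xhr";

// Just enough of the global environment to feature-detect against.
interface PageEnvironment {
  navigator: { sendBeacon?: unknown };
  fetch?: unknown;
}

// Picks the preferred available mechanism. If sendBeacon is disabled,
// the page does not give up: it drops down to an older code path.
function pickTransport(env: PageEnvironment): Transport {
  if (typeof env.navigator.sendBeacon === "function") return "sendBeacon";
  if (typeof env.fetch === "function") return "fetch";
  return "sync-xhr"; // the slow, blocking path the quote refers to
}
```

Whether real sites still ship the lower branches is exactly the point in dispute in this thread; the sketch only shows the mechanism, not how common it is.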

Thorin-Oakenpants commented 1 year ago

Yup, read that before - good reference. I don't agree with the last paragraph much, now that four or five years have passed. I think most sites have stripped out the older methods.

I liken this to 3p site data: oooh we need to block it .. no we don't, we have partitioning (and sanitizing). This is much the same, as in the proper solution lies elsewhere - a bit like referrers: not the best example, but if you are masking your IP, then the data is useless - you're just the same as any other user of that page (not to be confused with linkability of cross-site traffic via navigational tracking)

I still think beacon is a solution: just not a complete one

Thorin-Oakenpants commented 1 year ago

https://github.com/arkenfox/user.js/pull/1592/commits/6277d98539fb634df84d5d858465297e049178f3

Thorin-Oakenpants commented 1 year ago

lols - https://github.com/mozilla/standards-positions/issues/703 .. PendingBeacon .. so even when you crash they still get their analytics (I didn't read the spec)

fxbrit commented 1 year ago

oh look the spec has a list of Beacon alternatives that devs can use -> https://github.com/WICG/pending-beacon#problem-and-motivation

mik0l commented 1 year ago

Examples: https://vexell.ru/files/testpool/

WhyNotHugo commented 1 year ago

From what I can tell, Tor enables the API, but silently discards any data that trackers try to leak through it. Is this not feasible for Arkenfox?

Thorin-Oakenpants commented 1 year ago

^ read the thread - it already says that NS does this

There is also no real threat here as anything useful can already be sent without beacon

WhyNotHugo commented 1 year ago

There is also no real threat here as anything useful can already be sent without beacon

A key difference is visibility. I can open the inspector tab and see HTTP requests made by a page, but these beacon requests are not tied to the page. If one is triggered when closing it, it's not as easy to see.

Thorin-Oakenpants commented 1 year ago

that has nothing to do with the threat

WhyNotHugo commented 1 year ago

What threat?

Thorin-Oakenpants commented 1 year ago

you tell me - you started this up again by asking if arkenfox could do something - so clearly you must have a reason. If there is no threat then why did you ask?