rustybird / qubes-app-split-browser

Tor Browser (or Firefox) in a Qubes OS disposable, with persistent bookmarks and login credentials
BSD Zero Clause License
44 stars 8 forks

[question/suggestion] Support for extension (+ settings) #26

Closed emanruse closed 10 months ago

emanruse commented 10 months ago

I have just found this interesting project.

Background for the issue

Currently, I use disposables based on whonix-ws-dvm. I customize the DVM:

http://www.dds6qkxpwdeubwucdiaord2xgbbeyds25rbsgr73tbfpqpt4a6vjwsyd.onion/wiki/Tor_Browser/Advanced_Users#DVM_Template_Customization

because there is no other way to persist both extensions and user.js. uBlock Origin and uMatrix are very good for additional privacy protection. However, the block lists they use need regular updating to stay fresh, which requires starting the DVM regularly. Additionally, this whole "discouraged" procedure of running Tor Browser in the DVM makes it challenging to re-apply all the settings after each update of Tor Browser (through the Tor Browser downloader run in the DVM).
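For context, the user.js persisted this way is just a plain list of Firefox preference overrides applied at every browser start. A minimal illustration (these are standard Firefox pref names, but the exact set is of course the user's choice):

```
// user.js — hypothetical example of the kind of prefs persisted in the DVM
user_pref("javascript.enabled", false);         // disable JavaScript globally
user_pref("network.cookie.cookieBehavior", 1);  // block 3rd-party cookies
```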

Question / Suggestion

Is it possible to also install/configure/update extensions in such a split configuration?

Example usage:

  1. Have the software (torbrowser) in the whonix-ws-dvm, as usual, without customizing that DVM
  2. Then, have extensions (uBlock Origin, uMatrix) installed in another DVM, with a mechanism to auto-refresh their block lists in that DVM. The same DVM can also be used for other browser customizations.
  3. Have as many DVMs, similar to that from 2, as necessary.
  4. Use the browser by starting disposables based on a DVM from 3.
  5. Ideally, somehow combine that with firewall restrictions (which I am still researching). I am mentioning that because in Qubes it might be possible to have such firewall rules for each disposable (and/or DVM).

Note: I am aware that installing random extensions can compromise anonymity etc. if one doesn't know what one is doing. In the described scenario, I am rather envisioning usage with so-called "hard blocking" rules for uBO/uM without JS, even with CSS/images blocked (or limited to 1st party), so "leaking" to other parties is extremely limited (much more so than with the default "stock" Tor Browser).

rustybird commented 10 months ago

Note: I am aware that installing random extensions can compromise anonymity etc.

This warning is not just about malicious extensions spying on you though. It's much bleaker: Any changes you make to how Tor Browser operates (as far as the website can observe) will generally decrease the size of your anonymity set, i.e. make you more fingerprintable.

An individually installed ad blocker is pretty much the worst case, because it changes what resources a website can request over the network, how the website is rendered, and how you interact with the website, all of which are observable, and all of which are specific to the ad blocker, its version, the configured filter lists, and their versions.
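To make that concrete: which requests ever leave the browser is a direct function of the installed filter lists, so the pattern of requests a site's servers receive already distinguishes one blocklist (and version) from another. A toy sketch with made-up hosts and list contents:

```python
# Toy model: two hypothetical filter-list versions block different hosts,
# so the set of requests a website's servers observe differs between them.
FILTER_LIST_V1 = {"ads.example.com"}
FILTER_LIST_V2 = {"ads.example.com", "tracker.example.net"}

# Third-party resources a (made-up) page tries to load.
PAGE_RESOURCES = ["cdn.example.org", "ads.example.com", "tracker.example.net"]

def observed_requests(blocklist: set) -> list:
    """Hosts the website's side actually sees a request for."""
    return [host for host in PAGE_RESOURCES if host not in blocklist]

print(observed_requests(FILTER_LIST_V1))  # ['cdn.example.org', 'tracker.example.net']
print(observed_requests(FILTER_LIST_V2))  # ['cdn.example.org']
```

The two list versions produce two distinct server-side request patterns, with no JavaScript involved at all.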

Of course I hate ads as much as anyone, and would love to block them. In a Tor Browser context, the only viable approach I can see is to try and blend in with the Tails crowd, by somehow replicating their exact Tor Browser setup (which ships an ad blocker). The extension and filter list versions should then simply be frozen to whatever they are in the latest Tails release. It's a lot of work... But it's orthogonal to Split Browser.

emanruse commented 10 months ago

Thanks for the feedback.

I am not a professional webmaster, though I have worked on at least a few front- and back-end web apps, so I have some idea of how HTTP functions.

Assuming that:

I don't quite see how a website can better fingerprint a user who simply blocks 3rd party requests, unless the 1st party also owns the 3rd party and deliberately correlates all HTTP requests between the two hosts. That, however, would not only require significant additional resource$$$ but would also be illegal (against the GDPR, at least). I am not saying that being illegal would necessarily stop a wrongdoer (especially one with deep pockets). I am saying that the additional resources required would need to be very well justified by the result. If it is purely for marketing purposes, that seems almost completely unjustified. It is much cheaper to do it through JS (as they do).

For non-marketing purposes: Of course, much more complicated attacks are possible, even ones engaging multiple OSI layers, and then even Tor itself may not be able to help. However, an attack like that cannot be stopped by not installing browser extensions anyway.

What I mean here is: if an extension like uBO/uM is used to block 3rd party requests (not merely ads) and JS is disabled, that actually reduces the possibility of fingerprinting. Even the well-known panopticlick test proves it. Additionally, blocking requests to known malicious hosts helps further because it is surely much better not to communicate with a known bad host, no?

blend in with the Tails crowd

The first problem with all projects suggesting that approach is that the crowd is too small, which makes the whole value of it questionable. To my mind, if blending into a crowd is to be considered, the crowd should be the biggest one (Android). And that's just a starting point.

The second problem is that having JS enabled opens such a huge door to fingerprinting (through mouse movements, typing patterns and whatnot) that any effort to blend into a crowd seems somewhat too optimistic.

I would be glad to know what you think about this.

adrelanos commented 10 months ago

I don't quite see how a website can better fingerprint a user who simply blocks 3rd party requests, unless the 1st party also owns the 3rd party and deliberately correlates all HTTP requests between the two hosts.

There's a lot of stuff. You could look into what Tor Browser does and/or https://www.whonix.org/wiki/Data_Collection_Techniques

To my mind, if blending in the crowd should be considered, the crowd should be the biggest one (Android).

I don't see how blending in with the Android crowd would be technically possible. Technically it's not even a crowd. It's just a huge number of easily trackable individuals. I assume that most don't use any anti-fingerprinting, which is consistent with real-life surveys. The Android crowd isn't 1 shared identifier. So it's hard to blend in, unless one somehow simulates always being a unique, new, different fingerprint. I don't know of any project that works on that. Tor Browser is the biggest project working towards a shared fingerprint.

emanruse commented 10 months ago

There's a lot of stuff. You could look into what Tor Browser does and/or https://www.whonix.org/wiki/Data_Collection_Techniques

Unfortunately, that general information does not answer the actual question. So far, for many years, I have not been able to find a single piece of actual research, or even a simple proof, that having:

  • JS disabled
  • 3rd party resources blocked
  • tracking hosts blocked

increases the fingerprint. I have only found proof of the opposite.

Example:

The test in panopticlick (now known as https://coveryourtracks.eff.org) shows that the above setup gives:

6.19 bits of identifying information

The "blended in the crowd" default settings of Whonix's Tor Browser (JS disabled too and "Safest" mode, everything else untouched, out of the box) give the exact same final result. The difference is that in this case there are not even warnings that one is taken to a tracker (even a simulated one) which I consider worse.

So, I am still looking to understand how exactly communicating with Alice (1st party) but not with Bob (3rd party) makes me more fingerprintable to either of them (assuming they don't cooperate with each other).

I don't see how blending in with the Android crowd would be technically possible.

Perhaps through sophisticated mimicry. There are also other bigger-than-Whonix crowds which may be more feasible to blend into.

Technically it's not even a crowd. It's just a huge number of easily trackable individuals. I assume that most don't use any anti-fingerprinting which is consistent with real life surveys.

What is the evidence that most Tor/Whonix users are all educated in how not to be tracked easily? An anonymized IP address and non-persistent sessions are just tools, a good starting point. How one proceeds from there is all that matters for the final result.

The Android crowd isn't 1 shared identifier. So it's hard to blend in unless somehow simulating always being a unique, new, different fingerprinting.

So, you answered yourself. Noise.

I don't know any project that works on that. Tor Browser is the biggest project working towards a shared fingerprint.

Biggest != perfection reached.

adrelanos commented 10 months ago

emanruse:

There's a lot of stuff. You could look into what Tor Browser does and/or https://www.whonix.org/wiki/Data_Collection_Techniques

Unfortunately, that general information does not answer the actual question. So far, for many years, I have not been able to find a single piece of actual research, or even a simple proof, that having:

  • JS disabled
  • 3rd party resources blocked
  • tracking hosts blocked

increases the fingerprint. I have only found proof of the opposite.

Oops. I misread the original. I missed reading "better". My answer was for "fingerprint without JavaScript", which is obviously possible: cookies don't require JS.

But you asked about a "better fingerprint". No, I didn't say one becomes more fingerprintable when disabling JS.

https://support.torproject.org/tbb/tbb-34/

I don't see how blending in with the Android crowd would be technically possible.

Perhaps through sophisticated mimicry.

Noise.

Biggest != perfection reached.

In theory, yes. In practice, (HTTP) web browsers are complex beasts, practically operating systems in themselves: millions of lines of code, constantly changing, and privacy isn't a top priority for upstream.

It's hard to tame browsers. Librefox, arkenfox, Mullvad Browser, Tor Browser... all with tons of open issues just trying to reduce fingerprinting / achieve a shared fingerprint. Mimicry would be even harder than a shared fingerprint.

So I am afraid I find it unlikely. But happy to be wrong should such a project emerge.

It might be slightly more likely that an alternative Internet emerges that is not based on the modern web standards (HTTP). Perhaps Gopher or something similar to it.

https://en.wikipedia.org/wiki/Gopher_(protocol)

Since the protocol is "1000 times" simpler, it seems much easier to have a shared fingerprint, or maybe no fingerprint at all, because all users of it look alike.
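For illustration, the client side of Gopher really is tiny: open a TCP connection (real servers listen on port 70), send a selector string terminated by CRLF, and read until EOF. The sketch below runs against a throwaway local server thread so it is self-contained; the menu text and port are made up:

```python
import socket
import threading

# A canned Gopher menu: one informational item, then the "." terminator line.
MENU = b"iWelcome to a tiny Gopher hole\tfake\t(NULL)\t0\r\n.\r\n"

def serve_once(server: socket.socket) -> None:
    """Accept one connection, read the selector line, send the canned menu."""
    conn, _ = server.accept()
    with conn:
        conn.recv(1024)  # selector terminated by CRLF ("" requests the root menu)
        conn.sendall(MENU)

def gopher_fetch(host: str, port: int, selector: str = "") -> bytes:
    """The entire client side of the protocol: send selector + CRLF, read to EOF."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(selector.encode() + b"\r\n")
        chunks = []
        while data := sock.recv(4096):
            chunks.append(data)
        return b"".join(chunks)

# Local demo server so the sketch needs no network access.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
threading.Thread(target=serve_once, args=(server,), daemon=True).start()

response = gopher_fetch("127.0.0.1", server.getsockname()[1])
print(response.decode())
```

With no scripting, styling, or content negotiation in the protocol, there is very little for a server to vary its observations on.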

Technically it's not even a crowd. It's just a huge number of easily trackable individuals. I assume that most don't use any anti-fingerprinting which is consistent with real life surveys.

What is the evidence that most Tor/Whonix users are all educated in how not to be tracked easily?

Not sure how that topic came up, but I don't have data on that. Additionally, it's hard to prove a negative ("not easily tracked"). If it were the case, it would be more likely that some research paper titled "Tor Browser users are easily tracked" existed.

By comparison, a normal Android user on a stock ROM, as by far most users are, is "100%" trackable even with simple cookies. Even worse if the browser is logged into a Google account. No supercookies needed, not even JS-based fingerprinting. Tracking the default Tor Browser is harder because it disables a lot of tracking mechanisms.

emanruse commented 10 months ago

The question is:

(A) Untouched Whonix's Tor Browser (only JS disabled)

(B) Customized Whonix's Tor Browser (3rd party requests and tracking hosts blocked, as described above)

How exactly is (B) worse than (A)?

adrelanos commented 10 months ago

I am not saying it's worse. It would require research to compare that.

The reasons why Whonix only sets environment variables and doesn't customize Tor Browser files / recompile are documented here: https://www.whonix.org/wiki/Tor_Browser#You_should_Disable_JavaScript_by_Default!

(Updated just now.)

emanruse commented 10 months ago

I am not saying it's worse.

@rustybird wrote it is the worst:

An individually installed ad blocker is pretty much the worst case [...]

(... because it is supposed to make fingerprinting easier)

You supported that claim by linking to an article with general info about fingerprinting, which says nothing about the particular use case.

Now you say:

It would require research to compare that.

I shared the result of my own research, and it shows that (B) is actually better, as it allows more control. If there is no research supporting the opposite, I don't see why something better should be quickly discarded because of well-known general recommendations.

I am not saying that the general recommendations are bad. I am saying that the particular use case considers them, so it per se is worth considering too.

adrelanos commented 10 months ago

When I say research, I mean something like https://www.freehaven.net/anonbib/ - high quality, comprehensive, a structured approach, defined methods, mostly group effort, group consensus, discussed with other researchers, ideally peer reviewed.

Since this is complex stuff, I am not touching it but referring to The Tor Project (TPO) because they're working with researchers.

If that interests you, I suggest getting in touch with other researchers. Good research could actually result in TPO changing the browser defaults.

emanruse commented 10 months ago

I don't see why I should engage in academic-level research to show something which I have already shown and which anyone can verify. It seems there is simply a big misunderstanding here.

  1. This issue is in regards to the current project.

Along the lines of "do one thing", I assumed this project is not supposed to take care of how and why people use Tor Browser, or to protect them from "bad practices", because anyone willing to install any extension can do so regardless of the current project.

The fact is that a lot of people actually do install uBO, regardless of the well-known warning. Even Tails includes uBO, as noted. The Tor Project does not block the possibility of doing this. So, the current suggestion was not planning to change that.

  2. There is a difference between marketing and state-actor fingerprinting.

The demonstrated case (B) shows that for general-purpose browsing it is actually much safer than the "clean and untouched" default. The other case has been noted as a separate one too. The Tor Project's warning is about that other case, on the assumption that it is safer to warn everyone. However, most people don't fall into that category, so (IMO) it becomes gaslighting with the opposite result.

Considering that, and the Qubes architecture, it is easy to have separate DVMs for each case and use them wisely. The current project could make (B) easier. It has been rejected as not planned - no worries. Just wanted to clarify the reasons.

FWIW, the author of uBO has offered his help to The Tor Project with improving uBO further. This whole issue is worth reading:

https://github.com/uBlockOrigin/uBlock-issues/issues/1121#issuecomment-647131828