QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/
Possible tracking of DVMs #5764

Closed e6lk7dqzm83p closed 4 years ago

e6lk7dqzm83p commented 4 years ago

Qubes OS version: 4.0

Affected component(s) or functionality: Disposable VMs

Brief summary: I was accidentally booted from a Zoom meeting by a friend, and I decided to see if I could rejoin from a new DVM with a new IP address (using a different VPN server). When I tried to rejoin (with a different user name), I got an error saying that I had already been booted from the meeting. Given how DVMs should work, this should not be possible. It appears that Zoom (on the server side) is somehow able to track users across different DVMs (even with different underlying OSes).

To Reproduce:
1) Start a DVM, install Zoom
2) Join a meeting
3) Get booted from the meeting
4) Start a new DVM with a new IP address, install Zoom
5) Try to join a meeting

Expected behavior: Zoom cannot track behavior between different DVMs

Actual behavior: Zoom seems to be able to track behavior

Additional context: I tried this with multiple DVMs with different IP addresses (Fedora and Debian) and with Whonix; none were allowed to rejoin. The meeting ID is not unique for participants and I was able to rejoin from a VPN via my cellphone using the same meeting ID (i.e. the meeting isn't locked).

My VPN is run via Tasket's Qubes-vpn-support (https://github.com/tasket/Qubes-vpn-support).

marmarek commented 4 years ago

This indeed looks scary, but I think Zoom is using some kind of heuristic not based on host identity. Here is a hint: "By default, if you remove participants or panelists from the webinar, they won't be able to rejoin using the same email address." I guess if you are joining without registration, they apply some more restrictions. I tried to join anonymously from another physical machine, with a different system and a different network connection, and still wasn't allowed in. I guess it is based on connection time (like preventing anonymous users from joining for X minutes). I wasn't able to reliably test how it behaves for registered users (a freshly created account using a disposable email address wasn't enough, but those are trivial to detect). So, yes, this looks like system tracking that is quite reliable in practice, but it is more likely a basic analysis of conference participants' behavior, something that is mostly independent of your system.

@adrelanos have you analyzed similar cases by any chance?

e6lk7dqzm83p commented 4 years ago

@marmarek thanks for that, I assumed it was something like that (I didn't try it on my Windows VM so all the clients were Linux based), but figured I should mention it just in case. I found it quite disconcerting!

andrewdavidwong commented 4 years ago

Also, keep in mind that non-Whonix DisposableVMs do not claim to provide any special privacy (as opposed to security) properties. Fingerprinting across non-Whonix DisposableVMs is likely to be feasible. Why? Because ensuring that a VM enforces strong privacy standards is very difficult, and Whonix specializes in this. Given the difficulty, it doesn't make sense to try to reinvent the wheel outside of Whonix, so the standard answer is: If you need privacy, use Whonix. If you use something else (like a regular non-Whonix DisposableVM), don't expect it.

(I know that you said you also tried Whonix and got the same result. My comments are in reply only to remarks about DisposableVMs in general, e.g., "Given how DVMs should be working this should not be possible.")

In light of Marek's response and these considerations, I'm closing this as "not an issue." If you believe this is a mistake, please leave a comment, and we'll be happy to take another look. Thank you.

adrelanos commented 4 years ago

This is similar to: https://github.com/QubesOS/qubes-issues/issues/5629 (teamviewer related)

My personal, non-Qubes project related view point:

Both Zoom and TeamViewer are non-freedom, proprietary, closed source, commercial applications. Therefore my interest in these is very limited. See: https://www.whonix.org/wiki/Avoid_nonfree_software

It's important to define goals and non-goals of Whonix. Added here just now.

Quote https://www.whonix.org/wiki/Dev/Technical_Introduction#Goals_and_Non-Goals

  • Tor Browser development non-goal: make every user look like someone unique else.
  • Tor Browser development goal: to make everyone look the same from the perspective of destination websites; hiding the fact that someone is using Tor from destination websites.

Whonix by concept is an extension to Tor Browser expanding these goals to the whole operating system.

Goals are restricted not by what developers would ideally want to provide but by technical challenges and what's realistic.

My answer therefore is:

This boils down to a project request to be able to locally install and run applications with malicious / anti-features while being able to switch identity to someone else on demand. I don't think this is realistic / likely to happen in the mid-term future.

As a precondition, we would have to better understand and control existing non-deterministic artifacts (log files, time stamps, ...) on the disk. The reproducible builds movement might make packages 100% reproducible, i.e. not contain any non-deterministic artifacts. Hopefully even reproducible iso images, template images.

But once an installed operating system (TemplateVM) gets booted, non-deterministic artifacts are created. That would have to be fixed then too. Two people who upgrade an operating system (TemplateVM) but make no other changes should end up with deterministic, byte for byte identical TemplateVM (at least root) images. This is also related to stateless operating systems.
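
The "byte for byte identical" property described above could be checked by comparing cryptographic digests of two image files. This is only a minimal sketch, assuming a `sha256sum` binary (GNU coreutils) is available; the function names `image_digest` and `same_image` and the file names in the usage line are hypothetical, not anything Qubes or Whonix ships.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 digest of one file.
image_digest() {
    [ -r "$1" ] || return 2
    # Reading from stdin makes sha256sum print "<digest>  -"; keep the digest.
    sha256sum < "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: exit 0 if both files have the same digest,
# 1 if they differ, 2 if either file cannot be read.
same_image() {
    digest_a=$(image_digest "$1") || return 2
    digest_b=$(image_digest "$2") || return 2
    [ "$digest_a" = "$digest_b" ]
}

# Example (file names are placeholders):
#   same_image root-alice.img root-bob.img && echo "identical"
```

Two users would only need to exchange the digests, not the images themselves, to learn whether their templates have diverged.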

Maybe then malicious software running inside a DispVM and looking for uniqueness would find none, and everyone would look the same. Maybe then two Qubes DispVM users couldn't use $application as in this example, since they would be considered the same user even if they are not.

What might happen then, depending on the number of users with such setups, is that such users are simply ignored. And if not ignored, the vendor might resort to other uniqueness checks. Non-deterministic artifacts on the disk are not the only thing that can be used for uniqueness detection; benchmarking of the system could also be abused for fingerprinting. Or perhaps they would require voice / face detection. This can always be justified as DDoS / spam protection. A lot of effort, which would eventually be easily nullified. Therefore intentions matter. If they don't want you to use it, just leave them alone. Support Freedom Software instead.

e6lk7dqzm83p commented 4 years ago

The concern here is that DVMs with different Linux distros (Debian, Fedora) and Whonix, all with different IPs over the span of an hour or so, were identified by Zoom. Most importantly, is there something inherently wrong with Qubes/Whonix? You all seem comfortable that there isn't, which is very good.

The next question is even if it's heuristically based, is this something to be concerned about. I'm not sure what or if anything can be done about it, but it is definitely a privacy issue. It may be worthwhile to consider how to further anonymize behavior across VMs (for obvious reasons it's not a top priority).

As for Whonix, I completely understand it was out of scope for the intention. I was merely using it to see if a template that I had not touched in any way, running over Tor, was still identifiable as me. It was, and that's disconcerting.

=====================================================================

@adrelanos while I understand and appreciate your view of using "free" software, I think that's a somewhat unrealistic dictum...there's a reason that the Linux Desktop is dead at the Linux Foundation.

I can't get my twenty relatives to use a "free" solution (hell my dozen or so close friends who are all hardcore linux/cs nerds can't find a decent alternative to Zoom, though we may experiment with other non-free systems). I can't fill out a PDF form in Linux (I have to use Adobe via a VM, maybe FoxIt will work), and there are no viable FOSS alternatives to MS Office (I've used LibreOffice and it's corrupted countless documents). I can't get printers and scanners to work except for a very limited subset of them.

To your point, it does come down to purpose...those of us in this community still have to live in the real world and interact with other people/systems. Do I like Zoom? No, for the many reasons you mentioned and more. But does it work? Yes. Do I trust it? No. Which is why I'll download and install it only temporarily in a DVM and have it disappear. This goes into a bigger issue: who is Qubes for? Dissidents, whistleblowers, and people who want or need as much privacy and security as possible still need to fill out documents, edit presentations, and talk to their families. I choose to use Qubes (and do a million other difficult, expensive, tedious, and time consuming things) because I'm trying to maximize my privacy and security, but I recognize that almost all people I know (including people who are incredibly technical and love to do nerdy linux projects for fun) won't do a fraction of what I am doing.

I do try to get others to do things that will maximize their freedom as well, but if you truly want greater adoption the community needs more compatibility with every day products that people need. For 99% or more of people if they can't get their printer to work or when they're told there is no decent way to fill out a PDF form they won't use that platform (i.e. Linux)...and even people I know who maintain a custom Gentoo system still have a dedicated Windows box for certain tasks (mainly gaming and media).

marmarek commented 4 years ago

The next question is even if it's heuristically based, is this something to be concerned about. I'm not sure what or if anything can be done about it, but it is definitely a privacy issue.

I think this heuristic is simply tuned to be more strict than necessary. I tried from two different physical machines with different network connections, and it clearly detected different users as the same, so it wasn't able to identify whether it was the same machine or not. It looks similar to temporarily blocking login attempts after too many failed tries, regardless of the source. Does it make password bruteforcing harder? Yes. Does it mean they correlate identities of individual clients? No. Does it sometimes reject legitimate attempts? Yes.

e6lk7dqzm83p commented 4 years ago

@marmarek thank you for that analysis. It's hard for me to judge how big a threat this is with my current time and resources (e.g. one friend suggested I set up a proxy to MITM all packets from the different VMs and analyze them in Wireshark). Ultimately you are all experts in this and I'm a rank amateur, so I'll defer to your judgment and expertise.

Thank you all for your work on Qubes...I think it's one of the most novel and amazing things ever built and I appreciate that all of us are safer because of your efforts.

adrelanos commented 4 years ago

This might work better with StandaloneVMs since these don't share a root image. It all depends on how privacy-intrusive locally installed software implements fingerprinting.

Maybe I am asking something impossible / infeasible about using Free Software. But what you're asking me for is also impossible for me; I cannot do it. It's not as if there's a switch that I can flip and it's done. I've already explained the technical reasons why this is unrealistic and a battle that cannot be won.

For the definition of anonymity vs. pseudonymity, see: https://www.whonix.org/wiki/DoNot#Confuse_Anonymity_with_Pseudonymity

When you use Tor Browser, ideally you would be anonymous. But there are many issues with that. Possibly it's only pseudonymity. [1]

Once a VM is compromised (meaning malicious code running there, installed by yourself or compromised doesn't matter), anonymity is reduced to pseudonymity. And it opens up further attack vectors of which some are documented here: https://www.whonix.org/wiki/Multiple_Whonix-Workstations

These are known limitations of the project.

Simplified:

1) First there was the Tor network.
2) Then there is Tor Browser.
3) Now there is Whonix, which increases certainty that Tor will actually be used.
4) Now you're effectively asking to invent an anti-fingerprinting solution that allows running arbitrary, untrusted, locally executed programs. Everyone looking the same, or new unique identities on demand?

If a web service prohibits and takes action against anonymous / pseudonymous use (examples here are Google and Facebook: hard to create pseudonymous accounts there, requires mobile number verification, sometimes ID upload and whatnot), then Whonix can't do anything about that. If you want / must use such a service anyhow and want to preserve privacy, well, what can I say, bad luck. Life ain't easy.

This only gets worse for similar services that require locally installed software that does fingerprinting.

Even Tor Browser hasn't solved this issue and while web browsing is a huge and hard to fix area for anti-fingerprinting [1], attempting the same at the operating system level for locally installed software is even harder.

This is similar to VM detection / sandbox detection. Some malware does not want to run inside VMs to avoid getting caught, infecting non-valuable systems and analysis. But malware analysts want to use VMs / sandboxes, therefore they invented anti-vm / anti-sandbox detection. Now, that's an arms race. Malware authors can always invent new methods to detect VMs / sandboxes.
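
To make the VM detection side of that arms race concrete, here is one of the simplest checks a program could perform. It is only a sketch, assuming a Linux x86 guest, where the kernel normally lists a `hypervisor` CPU flag in `/proc/cpuinfo`; the function name `has_hypervisor_flag` is made up for illustration, and real detection code uses many more signals than this one.

```shell
#!/bin/sh
# Succeeds if the given cpuinfo file (default: /proc/cpuinfo) lists the
# "hypervisor" CPU flag in its "flags" line.
has_hypervisor_flag() {
    cpuinfo=${1:-/proc/cpuinfo}
    grep -E '^flags[[:space:]]*:' "$cpuinfo" 2>/dev/null | grep -qw 'hypervisor'
}

if has_hypervisor_flag; then
    echo "hypervisor flag present: probably running in a VM"
else
    echo "no hypervisor flag found (this does not prove bare metal)"
fi
```

Hiding this one flag would not settle anything, since a program can move on to timing, device names, or other signals, which is the point made above.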

Then it depends on the policy of the proprietary software's creators. How serious are they about fingerprinting and destroying privacy? A simple solution for them would be to add "can't run inside VMs" to their program. What do we do then? Add anti-VM detection? Then they could fingerprint other things such as performance. What do we do then? Emulate rather than virtualize to fake a realistic system performance profile? Then they could say "can only run on non-rooted Android (some apps already do that), iPhone, Windows". What are we going to do then? Emulate these platforms?

All depends how much they care about restrictions vs other priorities. Or they could use DRM...

And we might not be able to beat DRM. Also, while I am not a lawyer, this is not legal advice, and I am not much interested in this subject, circumventing DRM is illegal in some places, see:

https://en.wikipedia.org/wiki/Anti-circumvention

In short: producers of proprietary software have too many options. Not a fight worth taking up.

The reproducible builds initiative will help but ultimately I guess compromised VMs will stay pseudonymous.

See also this article on how Open Source is organized: https://www.whonix.org/wiki/Linux_User_Experience_versus_Commercial_Operating_Systems

Whonix is a research and implementation project. We take available Open Source components, create our own components, and put together a compilation. But what's not happening is the creation of a massive project (similar to Xen, VirtualBox, KVM): an emulator that can run arbitrary proprietary software with new identities on demand.

To Reproduce Start a DVM, install Zoom

"install Zoom": this is one crux.

Get booted from the meeting

We also have to be careful here. If these people don't want you, stay away. For legal and other reasons, we're not working on solutions to circumvent access control systems of third parties.


[1] https://trac.torproject.org/projects/tor/query?keywords=~tbb-fingerprinting https://trac.torproject.org/projects/tor/query?keywords=~tbb-linkability

e6lk7dqzm83p commented 4 years ago

I understand Whonix has a very particular scope and that I was using it outside of that scope. The only reason I used a Whonix system was to have a DVM that was entirely set up by someone else (i.e. I likely wouldn't have been able to mess up that DVM). That it used Tor as opposed to a VPN was nice (but I could have switched to sys-whonix as the NetVM for my DVMs and gotten that result).

Given that Zoom was seemingly able to track me across multiple DVMs with different IP addresses and underlying distros, I opened the issue because it indicated that there could be some sort of security problem (perhaps Zoom was somehow able to punch through the DVMs, or maybe there was some unknown persistence between DVMs that hadn't been accounted for). Unfortunately I had no way of evaluating it beyond what I had already done. @marmarek and others seem very comfortable that this is not a security issue but some other heuristic method (I realize I didn't try to see if it would work with my Windows VM, which would have been an interesting piece of data), so that's the end of it for me.

Please understand I'm not expecting Whonix to solve this problem (i.e., number 4 above): I get that it's very far afield from its purpose. I was using Whonix just for testing/diagnostic purposes to assure myself (and the people responding to the issue) that it's not just a user error. My comments were more about my general frustration with some of the FOSS community. I now realize that you thought I was asking for something that would be impossible and were responding to that. I'm sorry for my lack of clarity on that and for misunderstanding what your response was saying.

As I said before I'm very grateful for the work you all have done and continue to do and I'm glad that there isn't a security issue!

TNTBOMBOM commented 4 years ago

I have just tested opening 2 Zoom apps using these steps for setup:

Result: Working as expected, as Zoom didn't identify me!

ScreenShot: zoomba

marmarek commented 4 years ago

@TNTBOMBOM the "issue" was about kicking one participant and then trying to join again (from another DispVM), not just using two accounts (or non-registered clients) on the same machine.

SvenSemmler commented 4 years ago

@marmarek the explanation might be as simple as the MAC address. I tested with Bionic, Fedora and Win7 as well as Standalone, Template-based and Disposable. In all cases the MAC address was identical to the one in sys-firewall. And I did apply the random MAC and hostname as described on the Qubes homepage. That makes them random, but the same in all qubes anyway.

Maybe qubes should have randomized but unchanging MAC addresses (changing/random in disposables of course)?

Maybe (hopefully) Whonix already takes care of that? @adrelanos
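
For anyone who wants to repeat the comparison described in this comment, the MAC address a Linux qube sees can be read from sysfs. A small sketch, assuming a Linux guest with the standard `/sys/class/net/<interface>/address` files; the function name `list_macs` is made up, and the optional directory argument exists only so the function can be tried against a copy of the tree.

```shell
#!/bin/sh
# Print "<interface> <mac>" for every non-loopback network interface.
list_macs() {
    net_dir=${1:-/sys/class/net}
    for iface_dir in "$net_dir"/*; do
        # Skip entries without a readable address file (or an unmatched glob).
        [ -r "$iface_dir/address" ] || continue
        iface=${iface_dir##*/}
        [ "$iface" = "lo" ] && continue
        printf '%s %s\n' "$iface" "$(cat "$iface_dir/address")"
    done
}

list_macs
```

Running this in several qubes and comparing the output would show whether they all report the same address, as observed above.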

marmarek commented 4 years ago

This indeed is a plausible explanation. Every single VM in Qubes internally uses the same MAC, so if they do that based on MAC, indeed every VM (regardless of which machine!) looks the same. If anyone wants to verify this hypothesis, it is possible to change the MAC of a VM before starting it, using the qvm-prefs tool (in the case of a DispVM, that needs to be done on the DispVM template, like fedora-30-dvm):

qvm-prefs VMNAME mac 00:11:22:33:44:55
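
Instead of a hand-picked value, the address passed to the command above could be generated randomly. The following is only a sketch: `random_mac` is a hypothetical helper, `VMNAME` is the same placeholder as above, and the script merely prints the `qvm-prefs` line so it can be reviewed before being run in dom0. It assumes `od` and `/dev/urandom` are available.

```shell
#!/bin/sh
# Hypothetical helper: print a random, locally administered, unicast MAC.
random_mac() {
    # od prints six random bytes as hex pairs, e.g. " a1 b2 c3 d4 e5 f6";
    # word splitting turns them into $1..$6.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # First octet: set the locally administered bit (0x02) and clear the
    # multicast bit (0x01).
    first=$(( (0x$1 | 0x02) & 0xfe ))
    printf '%02x:%s:%s:%s:%s:%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
}

# Dry run only: print the command rather than executing it.
vm=${1:-VMNAME}
echo "qvm-prefs $vm mac $(random_mac)"
```

Setting the locally administered bit keeps the generated address out of the vendor-assigned range; whether a given application treats such addresses differently is not something this sketch can answer.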

SvenSemmler commented 4 years ago

https://zoom.us/privacy

"Data that our system collects from you: IP address, MAC address, other device ID (UDID), device type, operating system type and version, client version, type of camera, microphone or speakers, connection type, etc."

adrelanos commented 4 years ago

I've started exploring and documented this issue. Please leave feedback. See:

@marmarek the explanation might be as simple as the MAC address. I tested with Bionic, Fedora and Win7 as well as Standalone, Template-based and Disposable. In all cases the MAC address was identical to the one in sys-firewall. And I did apply the random MAC and hostname as described on the Qubes homepage. That makes them random, but the same in all qubes anyway.

Yes, we have to distinguish between the MAC address that goes out to the LAN and the MAC address that is only visible inside the VM.

Maybe qubes should have randomized but unchanging MAC addresses (changing/random in disposables of course)?

Maybe (hopefully) Whonix already takes care of that? @adrelanos

Using randomized MAC address inside VM, no. And I don't think it should, see: https://www.whonix.org/wiki/Dev/Technical_Introduction#Goals_and_Non-Goals

Random MAC is actually not easy, see: https://www.whonix.org/wiki/MAC_Address#Random_MAC_Addresses

MAC addresses are a rabbit hole, see:

Various pros and cons for either implementation. Therefore best discussed in separate issue.


@e6lk7dqzm83p no worries. I wasn't complaining. Just trying to explain the state of development. And, I am not only explaining it to you but any member of the public reading this. It's a good, important question, that's why I elaborated that much. I appreciate you asking the question. It's interesting. Hence, this discussion and above pages were created.

As I said before I'm very grateful for the work you all have done and continue to do and

Thank you for your appreciation!

marmarek commented 4 years ago

FWIW I have confirmed it is specifically about the MAC address. From the same DisposableVM after changing its MAC address (and removing zoomus.conf that cached it as DeviceID field, as well as other zoom configs) I was able to re-join a meeting without changing anything else (same IP, etc).
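
A quick way to see whether such a cached identifier is still present before retrying is to look for the field in the config file. This is a rough sketch with several assumptions: the thread only names the file (zoomus.conf) and the field (DeviceID), so the `~/.config/zoomus.conf` location, the line format, the `ZOOM_CONF` override variable, and the function name `has_cached_device_id` are all guesses made for illustration.

```shell
#!/bin/sh
# Assumed location of the Zoom client config; override with ZOOM_CONF
# (a variable invented for this sketch) if it lives elsewhere.
conf=${ZOOM_CONF:-$HOME/.config/zoomus.conf}

# Succeeds if the given file exists and contains a DeviceID entry.
# The match is case-insensitive because the exact spelling is assumed.
has_cached_device_id() {
    [ -f "$1" ] && grep -qi '^[[:space:]]*DeviceID' "$1"
}

if has_cached_device_id "$conf"; then
    echo "$conf still caches a DeviceID"
else
    echo "no cached DeviceID found in $conf"
fi
```

In a fresh DisposableVM the file should not exist yet, so this mostly matters when reusing the same VM, as in the test described above.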

e6lk7dqzm83p commented 4 years ago

@marmarek thanks, I didn't realize that there were fixed HW features in each DVM (I suppose that should have been obvious, but never thought about it before now). Would it be possible to somehow scramble/randomize all HW features that could be used to track someone (you had mentioned cameras and other devices above, which I know are used for fingerprinting/tracking by web services)? Probably not a high priority for you guys.

Also, wouldn't using a different physical machine have provided you with a different MAC address? I'm wondering why that wasn't caught (but as you pointed out, there may be other heuristics they're looking at).

Thanks again everyone!

andrewdavidwong commented 4 years ago

Would it be possible to somehow scramble/randomize all HW features that could be used to track someone (you had mentioned cameras and other devices above, which I know are used for fingerprinting/tracking by web services)?

If this were to be a feature request, I think it should be a feature request for Qubes-Whonix rather than Qubes VMs or DisposableVMs in general, for the reasons given in my last comment above.

adrelanos commented 4 years ago

Would it be possible to somehow scramble/randomize all HW features that could be used to track someone

I've discussed this in my previous 3 posts here.

e6lk7dqzm83p commented 4 years ago

Would it be possible to somehow scramble/randomize all HW features that could be used to track someone

I've discussed this in my previous 3 posts here.

Yes, sorry, I missed the amount of detail you went into. I entered some thoughts on the Whonix page.

@andrewdavidwong I think there are two different types of features as @adrelanos pointed out there are the Deterministic and Non-Deterministic attributes. I agree with you that the Non-Deterministic attributes are probably better left to Whonix, however I'm not sure why the Deterministic ones would be unreasonable to add to DVMs. Specifically I'm thinking about randomization of the hardware profile (vendor IDs, MAC addresses, device types, etc.) I suspect this is much more complicated than I think it is (though I think there are already projects that do this sort of randomization as browser extensions and there is https://github.com/alobbs/macchanger for MAC addresses, though it probably does not address the full range of complexities that can occur with MAC addresses that were mentioned above). Just trying to get your thinking before I create a feature request.

andrewdavidwong commented 4 years ago

I think there are two different types of features as adrelanos pointed out there are the Deterministic and Non-Deterministic attributes. I agree with you that the Non-Deterministic attributes are probably better left to Whonix, however I'm not sure why the Deterministic ones would be unreasonable to add to DVMs. Specifically I'm thinking about randomization of the hardware profile (vendor IDs, MAC addresses, device types, etc.) I suspect this is much more complicated than I think it is (though I think there are already projects that do this sort of randomization as browser extensions and there is https://github.com/alobbs/macchanger for MAC addresses, though it probably does not address the full range of complexities that can occur with MAC addresses that were mentioned above). Just trying to get your thinking before I create a feature request.

It sounds like the randomization of both deterministic and non-deterministic attributes would be a privacy feature, and the development of privacy features should be concentrated in Whonix VMs for the aforementioned reasons. I suppose I don't see why the randomization of deterministic features should be an exception to this.

e6lk7dqzm83p commented 4 years ago

Ok, I don't want to change the purpose/mission of Qubes, but I think for a lot of people having a more private experience in Qubes (particularly DVMs) is a plus. If the consensus is that is not reasonable I'll refrain from making the feature request (or I can make it and see how much interest it garners from people?). Sorry, I'm trying to be a good citizen of the Qubes community.

adrelanos commented 4 years ago

Specifically I'm thinking about randomization of the hardware profile (vendor IDs, MAC addresses, device types, etc.)

Qubes consists of various components. Xen, Linux, Fedora, Debian, Whonix.

See this page to learn more about the organizational structure of Linux (Xen) distributions: https://www.whonix.org/wiki/Linux_User_Experience_versus_Commercial_Operating_Systems

In short: we take existing components, add our own components and create a bundle which is usually called distribution.

This is also a missing feature in the virtualizer, Xen.

This issue is difficult to explain, and a feature request for it is difficult to write. A badly written one would not be understood and would most likely go nowhere. Even if a good one were written, I think the chance of upstream Xen being interested is very low.

A question of mine previously remained unanswered: https://lists.xen.org/archives/html/xen-users/2015-11/msg00114.html

Neither Qubes nor Whonix are into the business of forking virtualizers.

This type of request probably boils down to something like "fork Xen" or "create a privacy enabled virtualizer". It's just unrealistic from a project of this size to expect.

If https://github.com/QubesOS/qubes-issues/issues/1850 / https://github.com/QubesOS/qubes-issues/issues/2558 didn't make progress after 4 years (not complaining; if it was easy, I'd have done it myself already), and those are about remote fingerprinting (keystroke based fingerprinting), you can imagine the chance and waiting time for a feature request which is about local arbitrary code execution VM fingerprinting.

If you want this to happen, then it most likely won't happen from a feature request. That would just be another ticket somewhere, and if you check again in 5 years, exactly nothing will have happened. There are a lot of such tickets already. Go to a computer security related conference and give a talk about this issue to draw attention. Or go to a place where Xen developers hang out and/or a Xen conference and try to explain the problem. Or raise lots of funds to pay people to make it happen.

For example, progress was made on "measured boot and boot integrity remote attestation", but that came not from a feature request but from people doing it.

I am not directing this to you personally but rather to anyone interested in the same.

I suspect this is much more complicated than I think it is

Right.

though it probably does not address the full range of complexities that can occur with MAC addresses that were mentioned above)

Correct.

Just trying to get your thinking before I create a feature request.

Good.


Closing comment of this ticket https://github.com/QubesOS/qubes-issues/issues/5764#issuecomment-611955663 is still valid.

Page https://www.whonix.org/wiki/VM_Fingerprinting already links to a lot of tickets, features, initiatives. If all of these are implemented one day, a compromised but still sandboxed application might no longer be able to link different VMs. They would all look alike.

But that still wouldn't help with proprietary software. Actually, different users would then be falsely detected as being the same by the proprietary software, and it would be interesting to see how its vendors react. They might not care if too few users are affected, and/or they might oppose these measures since they run counter to their goals.