oxen-io / oxen-improvement-proposals

The Loki Improvement Proposal repository

Privacy Implications of Replacing the Oxen Privacy Coin in the ONS Registration Process #49

Open venezuela01 opened 1 year ago

venezuela01 commented 1 year ago

Privacy Implications of Replacing the Oxen Privacy Coin in the ONS Registration Process

Introduction

ORC-8 proposes replacing the CryptoNote-based Oxen privacy coin with an EVM-based transparent token. The proposal also suggests an optional third-party privacy tool, such as Tornado Cash, as an added layer of privacy protection for sensitive operations like ONS registration, giving users the option to prioritize enhanced privacy.

While some believe that this change empowers users to make individual decisions regarding their privacy requirements, concerns have arisen over the potential privacy compromise for the Session messenger network due to the removal of the Oxen privacy coin.

Key Concerns and Findings

Our comprehensive analysis identifies significant risks associated with removing the inherent privacy features of the Oxen coin from the Session ecosystem. Notably:

  1. Users choosing a transparent token for ONS (Session username) registration could face compromised privacy.
  2. Those who prioritize privacy and use tools like Tornado Cash to obscure their EVM wallet may still be at risk of exposure.
  3. There's a potential for backward information leakage from the Session ecosystem to the transparent EVM chain ecosystem, endangering even those wallets which have undergone mixing via Tornado Cash.

Visual Case Studies

To facilitate understanding, we discuss various perspectives accompanied by visual representations:

Case 1: Oxen's Inherent Privacy Framework

The above outlines the ideal privacy architecture of the integrated Oxen-Session ecosystem. Each dotted line in the visual representation indicates a concealed connection between two nodes, which is what makes the privacy framework robust. However, there are hidden threats not shown in the above graph. In the following sections, we will gradually uncover these concealed dangers and discuss how they are amplified when transitioning from a privacy chain to a transparent one.

Case 2: Common Understanding of Using Transparent EVM Tokens Directly

This interpretation portrays a conventional perception regarding the worst-case scenario when transparent tokens are used to directly facilitate the ONS registration process. It suggests that should a user, nonchalant about potential data breaches, opt to utilize a KYC-compliant funding source for ONS registration, they are personally assuming the inherent risks.

Yet, subsequent analyses will elucidate that this representation of Case 2 is somewhat oversimplified. Numerous latent risks are underestimated and await discovery. More concerningly, a subset of users, indifferent to their privacy, may inadvertently endanger connected users who have been meticulous about safeguarding their privacy.

Case 3: Common Assumption of Utilizing Tornado Cash for Augmented Privacy

This model delineates the prevailing belief surrounding tools like Tornado Cash and their purported efficacy in bolstering user privacy. However, this protection may be weaker than it appears. In subsequent sections, we will examine the vulnerabilities that undermine it.

Case 4: The Potential Threat of Uncovered Session Weakness Compromising Tornado Cash Protection

Previous case studies have assumed the absolute safety of Session's native social graph protection. However, no system is absolutely secure all the time. If an unknown weakness in the current implementation of 1-1 chats, closed groups, or open groups is exploited by an attacker to extract the social graph, the damage will be magnified by the transparent EVM chain.

Let's consider David, who has carefully safeguarded his wallet using Tornado Eth. If David’s Session social graph is exploited, a state actor might request exchanges to reveal specific customers’ information. This would unveil the identities behind wallets such as Alice.Eth, Bob.Eth, and Carol.Eth, which are associated with ?Alice.Session, ?Bob.Session, and ?Carol.Session. These are David’s Session contacts. The next step for the state actor would be to contact Alice, Bob, and Carol using their real-world identities, compelling them to disclose the identity of David.

Here we employ the custom symbol >-- --< to represent the exploitation of a social graph. As a side note, once >-- ?David.Session --< is exposed from his social graph, the information leak can further propagate backward to David’s private wallet ??David.Tornado.Eth. The arrow from >-- ?David.Session --< to ??David.Tornado.Eth is not misplaced. It is deliberately marked to emphasize the counterintuitive and unexpected direction of information leakage from an ONS back to a wallet.

David might have previously considered this wallet safe and could have used it for other activities, assuming privacy. However, he might not have anticipated that:

  1. His social graph could leak.
  2. His contacts who neglect to protect their wallet privacy might be identified by a state agent.
  3. The combined effects of (1) and (2) could indirectly compromise the assumed privacy of his wallet. This would link back from his Session ONS registration and could eventually expose his real-world identity through the real-world identities of his contacts.
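The three steps above can be sketched as a small graph lookup. This is a minimal illustration in Python, not an actual attack tool: the function name, the data layout, and all identifiers (`david.session`, `alice.eth`, and so on) are hypothetical, and it assumes the attacker already holds a leaked social graph, the public ONS-to-wallet links from a transparent chain, and a set of wallets identified through exchanges.

```python
from collections import defaultdict

def exposed_via_contacts(social_edges, ons_wallet, kyc_identity):
    """Find accounts that are not identified directly but have identified contacts.

    social_edges: iterable of (session_id, session_id) pairs from a leaked social graph
    ons_wallet:   {session_id: wallet} links visible on a transparent chain
    kyc_identity: {wallet: real-world identity} obtained from exchanges
    All three inputs are hypothetical illustrations.
    """
    neighbours = defaultdict(set)
    for a, b in social_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    exposure = {}
    for session_id, contacts in neighbours.items():
        if ons_wallet.get(session_id) in kyc_identity:
            continue  # already identified directly; not the case of interest
        # Real-world identities of contacts whose wallets are KYC-linked.
        known = sorted(
            kyc_identity[ons_wallet[c]]
            for c in contacts
            if ons_wallet.get(c) in kyc_identity
        )
        if known:
            exposure[session_id] = known
    return exposure

social_edges = [
    ("david.session", "alice.session"),
    ("david.session", "bob.session"),
    ("david.session", "carol.session"),
]
ons_wallet = {
    "alice.session": "alice.eth",
    "bob.session": "bob.eth",
    "carol.session": "carol.eth",
    "david.session": "david.tornado.eth",  # funded through a mixer
}
kyc_identity = {"alice.eth": "Alice", "bob.eth": "Bob", "carol.eth": "Carol"}

exposure = exposed_via_contacts(social_edges, ons_wallet, kyc_identity)
print(exposure)
```

In this toy data David's wallet never touches an exchange, yet his account is the only entry in the result, because all three of his contacts can be named and questioned.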

While it may appear nitpicky to speculate about a theoretical compromise of the Session social graph, especially when it is designed to be protected, it is vital to adhere to the Defense in Depth design principle: if one part of the system fails, the remaining layers should still keep users safe. It's also worth noting that the current ONS design could potentially be vulnerable to a dictionary attack using third-party breached datasets. Such datasets could be leveraged to reconstruct Session users’ identities if ONS is adopted on a vast scale, even if there are no flaws in Session’s cryptographic implementation. We will delve deeper into this in the following case study.

Case 5: Third-Party Breached Dataset Potentially Compromising Tornado Cash's Privacy Protection through ONS Dictionary Attack

As highlighted in Risk of Dictionary Attacks on ONS, and according to List of Data Breaches and Biggest Data Breaches, over 10 billion records have been compromised in more than 360 data breaches up to 2023. This alarming figure exceeds the world's human population.

Countless potential usernames can be extracted from these leaked datasets, facilitating a dictionary attack against Session’s ONS system to infer the identity of a Session account. If a dictionary attack is successfully executed, it could further endanger the privacy of the Eth wallet used to register an ONS.

In the given example, although everyone, including Alice, Bob, Carol, and David, has diligently used Tornado Cash to fund a brand new wallet before registering an ONS, they failed to realize that a dictionary attack could jeopardize their wallets, as the arrows in Alice.LeakedData - - -> ?Alice.Session -> ?Alice.Tornado.Eth indicate. The long dashed line with an arrow between Alice.LeakedData and ?Alice.Session signifies a connection based on a dictionary attack, which succeeds only with some probability. In a large-scale assault, however, a significant number of attacks can still succeed. These users have no idea that their wallet privacy has been breached and might continue to over-rely on the protection of Tornado Cash, potentially using these wallets for other endeavors.
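The chain Alice.LeakedData - - -> ?Alice.Session -> ?Alice.Tornado.Eth can be illustrated with a short Python sketch. It rests on a loud assumption: that name records are visible on-chain as an unsalted hash of the name. BLAKE2b is used here only as a stand-in, and the real ONS hashing scheme may differ. The function names, usernames, and wallet labels are all hypothetical.

```python
import hashlib

def name_hash(name: str) -> str:
    # Assumption: unsalted BLAKE2b-256 over the lowercased name.
    # This is a stand-in, not the confirmed ONS scheme.
    return hashlib.blake2b(name.lower().encode("utf-8"), digest_size=32).hexdigest()

def dictionary_attack(onchain_records, candidate_names):
    """Match candidate names against hashed on-chain name records.

    onchain_records: {name_hash: registering_wallet} as seen on a transparent chain
    candidate_names: usernames harvested from breached datasets
    """
    hits = {}
    for name in candidate_names:
        h = name_hash(name)
        if h in onchain_records:
            hits[name] = onchain_records[h]
    return hits

# Hypothetical on-chain state: two registrations, both funded via a mixer.
onchain_records = {
    name_hash("alice1987"): "wallet_A",
    name_hash("zq8-unguessable-v1"): "wallet_D",
}
# Hypothetical usernames taken from a breached dataset.
leaked_usernames = ["alice1987", "bob_the_builder", "carol.k"]

hits = dictionary_attack(onchain_records, leaked_usernames)
print(hits)
```

Only the reused username is recovered, which matches the probabilistic nature of the dashed arrow: the attack fails on names absent from the dictionary but needs no flaw in the cryptography to succeed on the rest.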

Case 6: Sophisticated Deanonymization Combining Information from Third-Party Data Leaks and Payment Network Records on a Transparent Chain

The previous case study doesn't tell the full story. In reality, leaked datasets could themselves carry sufficient information to construct a social graph on their own. Concurrently, a payment graph can also be constructed on a transparent chain connecting different wallets. If Session plans to support Eth transactions with an integrated wallet, or if any competitor messenger app decides to integrate a wallet with Eth support, or if, conversely, any third-party wallet app opts to integrate a messenger feature, any of these events could potentially expose social connections through transaction flows on a transparent chain.

Users might share the same wallet private key between compatible apps or transfer funds from one wallet app to another. Both actions effectively construct almost the same transaction-based social graph on one app or another, potentially with sources verified through KYC, as indicated by the optional coin mixer wrapped in square brackets in [.Tornado].Eth. When an unblinded social graph extracted from a leaked dataset is combined with a partially blinded social graph reconstructed from a transaction network, sophisticated deanonymization techniques can be employed. This is highlighted by the parallel line pairs with identical colors in the graph. These techniques, amplified by a large set of unblinded seed nodes from either the exchange side or the ONS side, could significantly undermine the privacy protection of many Session users with ONS enabled, and further pose a threat to those who use Tornado to protect their wallets.

Note: The solid line connections between *.Session and *[.Tornado].Eth are supposed to be bidirectional; unfortunately, there is no way to decorate a line with a bidirectional arrow in GeoGebra.
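To give a feel for how seed nodes amplify such techniques, here is a deliberately simplified seed-propagation heuristic in Python. It is a toy in the spirit of published graph-matching deanonymization work, not the specific method used by any analytics company, and every name in it (`propagate`, `w1`..`w4`, the `min_score` threshold) is hypothetical.

```python
def propagate(known_graph, blinded_graph, seeds, min_score=2):
    """Extend a seed mapping from blinded nodes (wallets) to known identities.

    known_graph / blinded_graph: {node: set of neighbour nodes}
    seeds: {blinded_node: known_node} pairs already identified
    A blinded node is matched when exactly one unused known node shares
    at least `min_score` already-mapped neighbours with it.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for u in sorted(blinded_graph):
            if u in mapping:
                continue
            # Known identities of u's neighbours that are already mapped.
            images = {mapping[n] for n in blinded_graph[u] if n in mapping}
            scores = {
                v: len(images & known_graph[v])
                for v in known_graph
                if v not in used
            }
            if not scores:
                continue
            best = max(scores.values())
            winners = [v for v, s in scores.items() if s == best]
            if best >= min_score and len(winners) == 1:
                mapping[u] = winners[0]
                used.add(winners[0])
                changed = True
    return mapping

# Unblinded social graph, e.g. rebuilt from a leaked dataset (hypothetical).
known_graph = {
    "Alice": {"Bob", "Carol", "David"},
    "Bob": {"Alice", "David"},
    "Carol": {"Alice", "David"},
    "David": {"Alice", "Bob", "Carol"},
}
# Blinded transaction graph between wallets on a transparent chain (hypothetical).
blinded_graph = {
    "w1": {"w2", "w3", "w4"},
    "w2": {"w1", "w4"},
    "w3": {"w1", "w4"},
    "w4": {"w1", "w2", "w3"},
}
seeds = {"w1": "Alice", "w2": "Bob"}  # e.g. identified through an exchange

mapping = propagate(known_graph, blinded_graph, seeds)
print(mapping)
```

With two seeds the remaining two wallets are resolved in two passes: w4 is matched first, and that new match in turn unlocks w3. Real attacks must cope with noisy, incomplete graphs, but the snowball effect of a large seed set is the same.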

The consequences should not be underestimated. Without ONS on a transparent chain, Session's net contribution to global privacy is positive. However, with large-scale ONS registration from a transparent chain, privacy will be gradually eroded over time. Ultimately, Session's net contribution to global privacy may regrettably turn negative. This is because information leakage is bidirectional and multidimensional, and ONS registration provides a plethora of resources to connect the dots and reconstruct the full picture, transforming some of the most private nodes into highly vulnerable ones.

Conclusion

The case studies outlined above are practical and should not be neglected. Professional companies and state agents already possess the necessary infrastructure and third-party datasets. All they need to do is build plugins that extend their systems to include one more data source, using established algorithms to further expand their information pool.

The cases presented are just a few ways to exploit the system and are far from exhaustive. In the real world, network topologies are highly complex, and any unexpected connection could potentially expose sensitive data. Machine learning-based algorithms can effectively mine information from complex datasets. Ironically, the challenging part might be drawing these graphs and explaining why these exploitations work (actually, that's a bit of an exaggeration). Even with a probability of false positives or mislabeling, these algorithms are powerful enough to challenge Session's design assumptions and its value proposition.

Given the risks highlighted, it's clear that transitioning to an EVM-based transparent token for ONS registration, even when using optional third-party privacy tools, poses significant privacy threats to the Session ecosystem. A thorough examination of ORC-8 is essential to maintain the high privacy standards that Session users expect and deserve.

As demonstrated by this report, the Oxen chain is a fundamental part of the Session ecosystem. If our goal is to maintain the privacy of Session users, ONS registration based on Oxen from a privacy chain offers far better protection than ETH-based registration. If we shift to ETH and abandon Oxen, the advantages of ONS compared to the more widely adopted ENS system are minimal. The ENS system, as demonstrated by the sophisticated attack in Case 6, can itself pose a threat to global crypto privacy. An Oxen-based ONS distinguishes us from other systems. We should continue to seek ways to enhance its security and privacy, as seen in proposals like Addressing the Risk of Dictionary Attacks on ONS and Enhancing Privacy Protection in ONS: Session Subaccount as a Unidirectional Privacy Firewall to Protect the Main Account, rather than abandoning our strengths.

Satoshi Nakamoto couldn't foresee the intricate deanonymization methods that would emerge a decade after Bitcoin's creation. For the same reason, the uncertainties outweigh our certainties when projecting ten years ahead. When designing a secure and private system, we should try harder to attack our own design so that we can better protect our users. One potential enhancement to ORC-8 could be to use an EVM-based token as both a liquidity layer and a branding representative on the surface. Simultaneously, the Oxen-based chain could serve as a foundational privacy firewall at a lower layer, silently safeguarding any substantial interoperation with the Session user network. Further research is required for this approach.

venezuela01 commented 10 months ago

See also:

Blockchain is Watching You: Profiling and Deanonymizing Ethereum Users

Practical Deanonymization Attack in Ethereum Based on P2P Network Analysis

Behavior-aware Account De-anonymization on Ethereum Interaction Graph

ETGraph: Insights from Ethereum Transactions and Twitter Data

eth-twitter

venezuela01 commented 9 months ago

Update: I have proof-of-concept code to demonstrate that some of the concerns or premises in the above report are practical. I've also shown some relevant results to @KeeJef and @jagerman.

venezuela01 commented 9 months ago

Another case study worth discussing is debank.com. Debank is a transparent Web3 social platform akin to Twitter, featuring 6 million monthly visits and over 60,000 registered users with their wallets connected. Users on Debank can follow each other as on Twitter, creating a social graph linked to their blockchain wallet addresses across various chains. This user base provides a real-world, transparent dataset for privacy research.

If Debank users onboard to Session and then purchase Session ONS or other premium features using the same wallet linked to their Debank account, they inadvertently carry their social graph information from Debank to Session. This action potentially exposes the Session network to risks of deanonymization. Merely using optional privacy transactions is not sufficient for a user to protect their privacy. Unless everyone you interact with also obtains privacy protection, the risk of exposure persists and spreads. Privacy considerations carry negative externalities, similar to environmental pollution or infectious diseases. Just as an individual polluting the environment affects others living in it, a user negligent about privacy likewise compromises the privacy of their contacts.
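The way a public follow graph gets carried into Session amounts to a join on the wallet address. The following Python sketch shows this under stated assumptions: the function name, the wallet labels, and the Session names are hypothetical, and the follow list is a stand-in for whatever a transparent platform exposes, not a real Debank API.

```python
def import_social_graph(public_follows, ons_registrations):
    """Project a public wallet-level follow graph onto Session accounts.

    public_follows:    iterable of (wallet_a, wallet_b) pairs from a transparent platform
    ons_registrations: {wallet: session_name} visible on a transparent chain
    Returns the set of Session-level edges implied by reused wallets.
    """
    edges = set()
    for a, b in public_follows:
        if a in ons_registrations and b in ons_registrations:
            edges.add(frozenset((ons_registrations[a], ons_registrations[b])))
    return edges

# Hypothetical data: three wallets follow each other publicly,
# but only two of them were also used to pay for an ONS name.
public_follows = [("0xA", "0xB"), ("0xB", "0xC"), ("0xA", "0xC")]
ons_registrations = {"0xA": "alice", "0xB": "bob"}

session_edges = import_social_graph(public_follows, ons_registrations)
print(session_edges)
```

No Session data is needed for this step. The edge between the two Session accounts comes entirely from outside the network, which is why one user's wallet reuse affects the contact as well.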

Even if every Session user currently keeps their wallet address isolated, there is no guarantee they won't make mistakes in the future. For instance, a Session user might accidentally use the same wallet address to register a Debank account five years later, after purchasing Session ONS or a Session premium feature with that wallet. They might forget that the wallet was intended solely for a privacy messenger app. Once such a mistake is made, their past efforts to protect their privacy would be nullified. We should not design a system that leaves our users exposed to such risk.

venezuela01 commented 9 months ago

Since the team did not comment on this issue three months after it was posted, I raised the issue in the Oxen Community. Below is a copy and paste of the discussion from the Oxen Community for archival purposes:

KeeJef: My view is that ONS is generally used as a discoverability feature rather than a privacy feature, moving to a transparent chain does weaken onchain privacy and its not something that can be 100% corrected by usage of existing Ethereum privacy solutions. However i think this issue is mostly limited to those users who opt into using ONS, I'm not convinced that the social graph leak issues uniquely affect an erc20 token vs Oxen as it currently operates, with very little transaction volume and most users withdrawing direct from exchanges then performing transactions.

venezuela01: I think you make a fair point about "ONS being generally used as a discoverability feature rather than a privacy feature." However, this is a disadvantage compared to some centralized messenger apps, which allow users to temporarily enable or disable username-based discovery and protect user privacy by applying rate limits to average users when searching for usernames. I don't have an easy solution for this issue either. I have proposed some ideas in [1], but I admit those solutions are quite complicated with limited benefits.

I agree that "moving to a transparent chain does weaken on-chain privacy, and it's not something that can be 100% corrected by using existing Ethereum privacy solutions." This opinion should be communicated to other team members, such as Alex Linton, Chris M., or Maxim, to ensure they don't overstate Session's privacy in future community interactions.

I use ONS as an example because it's the only monetization method we've implemented so far; there are no other realized monetization solutions for discussion. However, through a thought experiment, we can generalize the ONS model to any other monetization strategy. My view is that any kind of premium feature will be impacted. This is because, in designing a system where a paid user can temporarily delete their account and later restore it using the same passphrase (or recover it on another device) while retaining access to previously paid premium features, we need to link their payment evidence to their Session ID in some way. In other words, regardless of how we design and implement Session monetization, the users' Session ID will have to be linked to some transaction info. This linkage will inevitably leak the user's metadata if the transaction is made using a transparent token.

[1] https://github.com/oxen-io/oxen-core/issues/1649

alex: While it is fair to raise the issues you have and I encourage the discussion, I really do not think that saying Session will remain private after the transition is overstating anything.

venezuela01: I think this statement is an oversimplification. In my opinion, it represents a shortsighted perspective that overlooks the long-term impact and potential, favoring short-term goals at the expense of long-term privacy.

A more accurate statement would be, 'Unpaid Session users who do not interact with any paid Session users will remain private after the transition.' It's important to note that, according to [1], the top 3.5% of Session users are involved in over 50% of the messages. If these core users, who are the most loyal and value privacy the most, decide to pay, their privacy will probably be compromised, which is unfortunate.

[1] https://github.com/oxen-io/oxen-improvement-proposals/issues/60

venezuela01: The main difference between using a transparent chain and a privacy chain for the Session ecosystem is the degree of privacy gained or lost as we scale up. With a transparent chain, the more users utilize the monetization feature, the more metadata leaks and the more statistical patterns persist on the chain. There is an inverse relationship between the growth of the network and the privacy of the network. However, with a privacy chain, the more users that use the network, the better the privacy.

In practice, I have considered voluntarily running some self-transfer Oxen wallet clusters every two minutes and periodically deleting old wallets. This would ensure every Oxen block is non-empty and make it impossible for a state agent to link an on-chain ONS registration to a centralized exchange withdrawal transaction, even if the centralized exchange cooperates with the state agent. This will cost me approximately 788 USD per year at the current price (720 * 365 * 0.03 * 0.1), which is affordable. However, the detailed security analysis of this impact is another substantial topic as it has different implications for different people.
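For readers who want to check the yearly cost, the figure can be reproduced as 720 * 365 * 0.03 * 0.1 with a few lines of Python. The 720 follows from one transfer per two-minute block; reading 0.03 as the fee per transfer in OXEN and 0.1 as the USD price per OXEN is an assumption, since the comment does not label these factors.

```python
# One self-transfer per two-minute block.
BLOCKS_PER_DAY = 24 * 60 // 2  # 720
DAYS_PER_YEAR = 365

# Assumed meaning of the remaining factors (not labelled in the comment):
FEE_PER_TRANSFER_OXEN = 0.03  # assumed fee per transfer, in OXEN
OXEN_PRICE_USD = 0.1          # assumed price per OXEN, in USD

transfers_per_year = BLOCKS_PER_DAY * DAYS_PER_YEAR
yearly_oxen = transfers_per_year * FEE_PER_TRANSFER_OXEN
yearly_usd = yearly_oxen * OXEN_PRICE_USD

print(transfers_per_year)    # 262800 transfers per year
print(round(yearly_usd, 2))  # about 788 USD per year
```

Under that reading the cost scales linearly with both the fee and the token price, so the estimate would need to be revisited if either changes.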

A more interesting design would involve Oxen service node operators sending a minimum value of Oxen to each other when elected as validators. This approach would ensure that every block is non-empty while distributing the fee evenly among all nodes.

venezuela01 commented 9 months ago

https://t.me/Oxen_Community/401030

freQniK | MathNodes:

DEXes aren't even private either. It is easy to follow the swap of funds. At least with OXEN there is an intrinsic privacy layer. Not only that in your recent update you said, and I paraphrase "We are just going to implement SENT first and then worry about privacy with Railgun" Privacy on the back burner now. Here is an example of tracing swaps and order books in Cosmos. This guy Rarma is really skilled at what he does but he will be no match when governments use AI to trace blockchain txs. Also Rarma is an Aussie 😇 https://twitter.com/Rarma_/status/1722820171881185456

KeeJef:

Of course, tracking swaps between accounts is easier on transparent chains, not trying to argue that point, but the ETH ecosystem is still very easy to use without KYCing