monero-project / monero

Monero: the secure, private, untraceable cryptocurrency
https://getmonero.org
Other
8.88k stars 3.09k forks source link

[Discussion] Move to a Fixed Ringsize #4229

Closed SamsungGalaxyPlayer closed 6 years ago

SamsungGalaxyPlayer commented 6 years ago

I would like to receive feedback on moving to a fixed ringsize for the September 2018 (v8) protocol upgrade (hardfork). This will happen regardless of the selected ringsize. The current minimum ringsize is 7.

Scope of Discussion

This discussion includes the impact of setting a mandatory ringsize and support for a mandatory ringsize in wallet software and merchant services (eg: CLI, GUI, MyMonero, and Monero Integrations).

Please keep all comments and discussion related to these topics only, not the broader issues of different ringsizes. However, if you support this proposal under certain situations, such as supporting only if the ringsize is greater than x, you can comment with that information.

Major reasons for the change:

1. Few people use the non-default ringsize. At the time of writing, >96% of transactions in the past month used a ringsize <10 [1]. This is consistent with other findings where the minimum ringsize is used for the vast majority of transactions.

2. Non-default ringsizes reduce entropy and therefore decrease privacy. Ringsize is an important piece of metadata, and having people use different values allows users to more easily be identified.

3. Unusual ringsizes for multiple transactions can be linked. If a user sends multiple transactions with an atypical ringsize, then it is likely that these transactions are associated with each other. If this behavior is not permitted, users will not have their transactions linked this way.

4. Non-default ringsizes complicate research. Having exceptions to the number of decoys used complicates analysis on the effectiveness of Monero's privacy claims. Most analyses to-date assumed that all transactions use the minimum ringsize.

Major reasons against the change

1. A configurable ringsize provides flexibility. If a user would like to send transactions with a different ringsize for some purpose, they are able to do with a configurable ringsize.

2. Increasing ringsize increases privacy. Some people believe that increasing the ringsize increases privacy. While this is strictly true since there are more decoys included in the ring signature, the possibility of leaked metadata typically results in the opposite effect.

3. Support for services using non-default ringsizes. Users may be upset if they are unable to send transactions from outdated wallet software.


Overall, I believe that the pros outweigh the cons. I recommend setting a fixed ringsize in the September protocol upgrade (hardfork) and updating the CLI and GUI to support only a single ringsize.

ArticMine commented 6 years ago

@iamsmooth Actually in the XMO case I do not see churning as equivalent since the point of the large ring size was to have enough pre fork fake inputs to allow for an effective 5-7 ring, by literally brute forcing the selection algorithm. This is actually not necessary in the case of XMV since it supports the mitigations. I do recognize this is an edge case nevertheless it does provide an argument because it was unforeseen. I do agree that in most cases properly timed churning may address the additional privacy issue.

The bad actors argument is of course a very legitimate one; however it also has to be balanced with not allowing process to be so rigid as to effectively negate what may turn out to be a very legitimate proposal that was not made by a "bad actor".

ArticMine commented 6 years ago

A fixed ring size is minimizing the attack surface of "Any data available to an observer about the transactions

... and what was proposed by @dnaleor is to for example reduce this attack surface between 11 and 200 ring sizes from 189 possible ring sizes to 5 possible ring sizes.

It is not the kind of proposal that a "bad actor" would propose because it is very constructive, so the bad actor gain is debatable at best.

I can support a fixed ring size of 11 or @dnaleor's proposal as 11, 22, 44, ... etc. I am not supporting leaving the 1 ring increments since then the "bad actors" would win.

iamsmooth commented 6 years ago

@ArticMine

In the case of using churn to mitigate XMV, the transactions would need to be constructed to ensure that, across the two, they have a minimum number of pre-fork outputs. From the perspective of tracing, one can view the sequence of TX_a(ring size=N)->TX_b(ring size=N)->destination as equivalent to TX_c(ring_size=2N-1)->destination. The reason being that if the attacker manages to trace TX_b (which requires excluding N-1 fake outputs), they can only get to the original source if they also trace TX_a (which requires excluding another N-1 fake outputs). So the resulting number of outputs that need to be excluded is 2N-2, which is the same as what is required to trace TX_c.

To ensure that the combination of TX_a and TX_b have a sufficient number of pre-fork outputs in the XMV case requires a wallet change to implement, but so would (and did) doing the same thing with one larger ring.

As for bad actors, I wasn't actually accusing anyone of being a bad actor in this case. I was suggesting that removing a useful improvement (part of a long process of continuous improvement which has resulted in a much stronger Monero) on the basis of vague last minute objections from non-participants would embolden bad actors in the future.

I agree that @dnaleor's suggestion is certainly not coming a bad actor and in fact he has made that or a very similar suggestion previously in the process.

ArticMine commented 6 years ago

@iamsmooth For XMV what you suggest above works because XMV supports the mitigation that were added after the XMO fork. So one could construct transactions on both chains with common pre fork outputs. XMO on the other hand does not support the mitigations. With XMO, short modifying the wallet code, the only way to generate the required 5-7 pre fork outputs in a post fork XMO spend was to drastically increase the ring size so the the algorithm would chose about 5 - 7 pre fork outputs. These pre fork outputs could then be imputed into the subsequent XMR spend.

iamsmooth commented 6 years ago

@ArticMine

the only way to generate the required 5-7 pre fork outputs in a post fork XMO spend was to drastically increase the ring size so the the algorithm would chose about 5 - 7 pre fork outputs. These pre fork outputs could then be imputed into the subsequent XMR spend

We're veering off topic a bit here, but that doesn't seem like a very good mitigation to me. Over time the necessary ring size would need to be larger and larger and eventually it would be unlikely to ever get 5-7 pre-fork (once the fork is far enough in the past).

For any desired ring size you can construct a chain (as with TX_a->TX_b above) with the same properties across all of the fake outs within the chain. If you are relying on the statistical random selection of the unmodified wallet to give the necessary outputs (though, as above, this seems poor), then multiple chained transactions with the same number of total fake outs would have the (statistically) same total number of pre-fork fake outs.

ArticMine commented 6 years ago

Of course it only really works right after the fork, The practical reality however is that over time the problem will likely become moot due to the price of XMO continuing to fall. So those who used this very crude mitigation at the start simply helped accelerate making the problem become moot.

SarangNoether commented 6 years ago

I am in favor of moving to a fixed ring size. New blackball analysis indicates that deanonymization by effective ring size reduction from known outputs is negligible, even at the current minimum. Most recent research into ring-based attacks has focused on output heuristics, leading me to conclude that maximizing transaction entropy provides better protection overall, as @SamsungGalaxyPlayer brought up early in this conversation.

I will state, however, that heuristic-based analysis of transactions is extremely complex, since it is difficult to determine all the metrics an attacker might develop to determine "how suspicious" a transaction appears.

ArticMine commented 6 years ago

@Wigmnd There is an overall fallacy here in that only the change needs to be justified with detailed research, and the status quo does not. One can equally ask the question where is the research that justifies the status quo in this case?

SamsungGalaxyPlayer commented 6 years ago

@Wigmnd we do not have all the facts available. However, the situation as it currently exists is the following:

  1. There is some evidence that suggests an unusual ringsize is harmful.
  2. Very few people use a non-default ringsize.
  3. There is no evidence to suggest a non-default ringsize is effective under any known use-case.
  4. Surae and Sarang support a fixed ringsize.
  5. The code has already been written for a fixed ringsize.

I believe that while not every point of research has been evaluated, the preponderance of the evidence points towards supporting a fixed ringsize. Since the supporting reasons for a non-fixed ringsize have received little support, I feel more comfortable with a fixed ringsize than a non-fixed ringsize, even if there isn't significant evidence showing the exact implications of non-fixed ringsize harms in every known use-case.

Under every situation that has been brought up for increasing the ringsize above the minimum, churning has been shown to be a more effective solution. We can limit our attack surface to timing, not both timing and ringsize. I think that even without the full impact of this being presented in a research paper, it is clear enough that most developers and the Monero researchers concur that a fixed ringsize is preferable unless evidence to the contrary can be presented.

While we do not have all the evidence in the world to support a fixed ringsize, we have more evidence to support a fixed ringsize than a non-fixed ringsize. Thus, we should move to set a fixed ringsize.

@dnaleor has a middle-ground that may seem like a good compromise, but I generally disagree. While it would limit the attack surface, it will have nearly no impact in my estimation. Since people today rarely use non-default ringsizes, I do not believe that this will change, even with fewer options. It will in effect result in the status-quo. We can always increase the ringsize again in the future if new attacks are discovered. It's not like end-users will magically know what ringsize is best anyway for currently-unknown theoretical attacks.

fluffypony commented 6 years ago

@Wigmnd sorry but the absence of a published paper does not imply that "actual research" has not happened. We know that transaction uniformity is a goal in enhancing Monero's privacy, and that means uniformity in ring size as well. That there isn't a formal paper is evidence of how obvious this is, not of a lack of research.

b-g-goodell commented 6 years ago

I'm going to respond first with my recommendations, and in a moment I'll make another comment responding to specific commenters.

TLDR of my conclusion first: We should use a fixed ring sizes, and this ring size should increase occasionally. This ring size should probably not exceed 25 without a change in our ring signature algorithm. Ring sizes as low as 11 may, in fact, be insufficient to guard against certain statistical methods of linkability for targeted individuals. For efficiency reasons, it is reasonable for Monero to begin using Matthew Green's "How to Squeeze a Crowd" for further blockchain pruning for ring sizes in this range.

All the above recommendations are loosely based on the following idea, with an arbitrary 1% threshold: improving privacy by reducing a the true-positive rate of a single-sample test in a simple linkability analysis to under 1% is wasteful.

These recommendations are based on some catch-and-release-style statistical models of Monero blockchain linkability (see below) laden with assumptions (ugh) together with some (ugh) heuristic arguments. However, we have some independent verification: my results are not dissimilar from the simulation results by sgp, showing marginal gains for large ranges of possible ring sizes. The 1% threshold was selected somewhat arbitrarily based on a "controlled purchases" model of tracking money, under the assumption that this attack would have to be executed sequentially (at the expense of the attacker) to collect a sufficient sample size to obtain practically useful true positive rates.

I do not make complete, formalized arguments here, I am merely presenting the beginning of my work on the matter; either way, the Monero community ought not wait on the entire analysis before making decisions based on some preliminary results.

General stuff first. This is a discussion that normally would not require any input from the community. Why? Ring size is a security parameter, like key length. Larger is generally better for security, at a cost of efficiency. Choosing a fixed security parameter when implementing a scheme is objectively preferable to a non-fixed security parameter, because outputs from the scheme statistically match each other, at least with respect to the security parameter.

This is why encryption standards recommend common/fixed key and ciphertext lengths. Totally contrived example: all the angels in heaven are assigned a serial number based on when they were created by the almighty, and all use that serial number as their ciphertext and key lengths for the heavenly encryption scheme. Lucifer happens to have serial number 666; it's always easy to tell a ciphertext was encrypted by Lucifer.

Fixed ring sizes increase the indistinguishability of two transactions but they also remove an avenue of heuristic linkability. This is so important it cannot be quantified. Cryptography has a long history of people trying to disarm specific heuristics instead of constructing schemes against which heuristics are useless. This removes an avenue of heuristic linking.

So, this brings me to the big first point: fixed ring sizes should be included in the next update, and they should increase at least occasionally in future updates.

When should we increase ring size? Any time we have an opportunity to increase ring size to obtain a non-trivial gain in privacy without increasing the overall weight of the blockchain, we should do so. This way, weight always goes down and privacy always goes up. There are also circumstances where we should increase ring size despite overall weight of the blockchain, but that is a discussion for another day, because no one is proposing it here.

The tricky parts of the question: how to define weight and how to define non-trivial? We've defined weight elsewhere using the total time to download and verify, and we used that to pick our new fee structure for bulletproofs. Based on those approaches, here's a for-instance: we are currently sitting at what looks to be a 80% improvement in Monero's overall speed and efficiency if we take the ring size all the way to 11. This is a five-fold improvement in speed; are the gains we get in ring size non-trivial?

Catch-and-Release Corrupt Exchange Game: Assume we have a blockchain with C outputs on it (C for chain). To assess privacy gains against linkability analyses, I constructed the following two-player game using an atomic output-based cryptocurrency network. Alice (a corrupt exchange working with an evil government) sends a set of B transaction outputs to Bob called the blacklist, (B for blacklist) a better way to think of these are as a list of tagged transactions in a catch-and-release sort of ecological sense. Bob goes about his economic business, presumably churning a bunch, and eventually sends a set of transaction outputs called the deposit back to Alice. Alice outputs a bit, indicating whether Alice thinks any of the outputs in the deposit ``came from'' the outputs in the blacklist tagged outputs.

This is the first step towards making complete, formal treatment of fungibility in cryptocurrencies. In the case that the blacklist tagged list has only one output and Alice wishes to determine whether the output is included in the deposit or not, Alice has only two reasonable hypotheses: H_0, the null hypothesis, is that Bob made no blacklist tagged deposit, and H_1, the alternative, is that Bob made a blacklist tagged deposit. We assume for convenience that B = 1, and we set p = 1/C.

An analysis based on some assumptions: Alice has lots of data available from Bob's deposit to come to a conclusion. For example, she can go through every deposit's transaction history on the Monero blockchain and see how many (if any) of his deposits reference the blacklist deposit, and she can compare the distribution of the occurrences of the blacklist tagged list in the deposit history against H_0. If the blacklist tagged outputs occur too often, Alice can accuse Bob.

I assume each layer of outputs in a transaction's history is filled ring members for ring signatures at the previous layer. Each ring member is poison or not. Rings are selected independently from each other. We assume that the attacker has been executing a sustained attack over time such that, for any new ring signature, the probability of any one ring member being a blacklist tagged output is p = B/C = 1/C.

Under these assumptions and H_0, the number X of poison outputs at layer d in an R-ary tree of ring signatures, each with R ring members, is a Binomial random variable with parameters (R^d, p) where p=B/C. Under H_1, on the other hand, X-1 is a Binomial random variable with parameters (R^d - 1, p). The Neyman-Pearson lemma gives us the best possible statistical test to distinguish these hypotheses (hint: if X = 0, H_1 can always be discarded, and the best possible test is monotonic, so our rejection region for H_0 is all X > X_c for some critical X_c, i.e. we disregard the idea that Bob accidentally included a blacklist tagged output as a ring member totally at random if the single blacklist tagged output occurs too many times). Now, here's the fun part: this statistical test has a well-defined power, which can be computed numerically, and we can try to make this test as low-power as possible. When we do this, we obtain some nasty summations, but we can apply Chebyshev's inequality to obtain a lower bound.

Long story short, an attacker doing a catch-and-release style linkability analysis of Monero transactions on a blockchain of C outputs who has released exactly 1 poison output to catch someone performing d churns with a ring size of R has an approximate true positive probability <= C/R^d.

This is an asymptotic formula justifiable using (1) a binomial model with some specific null and alternative hypotheses, (2) the Neyman-Pearson lemma for finding the best possible statistical test of those hypotheses, and (3) bounding some integrals from below using Chebyshev's inequality. Obviously, if R is too small, this bound is useless, it's bigger than one! This is a consequence of the weakness of Chebyshev's, but that's okay, the inequality still holds.

For a blockchain with 10^6 outputs on it, a ring size 11, and churning d=5 times, the above test has a true positive probability of at most 62% (that ain't good, although it is likely lower due to the weakness of Chebyshev's) but increasing the ring size to 25 brings that true positive probability down to a hair over 1%. To give an idea, a true positive rate better than 50% is actually not super great for us fungibility, which brings me to my second and third major points: increasing ring size past 25 is probably wasteful, and 11 is arguably too low to guard against certain controlled-purchase catch-and-release-style attacks.

Those are unreaslistic assumptions though! Yeah, they are. If Alice changes her technique, or if Bob selects rings in a way that aren't independent, or does something wonky, then the above model breaks down. But it's a quantitative starting point for assessing "when will increasing ring size be a wasteful decision?" to manage against inadvertent linkability information being leaked.

One last addition: Note that the tagged outputs are merely tagged; the terminology "blacklist output" is misleadingly sinister, as some community members have pointed out to me.

fluffypony commented 6 years ago

I apologizes then however this will bring me back to the point that this is using insider knowledge known only to a few and which is not publicly available. If the goal is public consensus you must present those findings to the public and allow time for the public to review them before asking for a pubic consensus. This all has its own issues but I don't want to stray off topic right now.

No, obvious is not the same as insider. Would you consider it "insider knowledge" that √2 is an irrational number? Or would you merely consider that to be something that someone with a sufficient grasp of mathematics would know? I'd argue that the onus is on you to prove you have the background and understanding to present a valid counterpoint, otherwise you're effectively saying "I don't understand this so it shouldn't happen" which is an irrational argument (excuse the pun).

paulshapiro commented 6 years ago

lol insider knowledge this has been discussed on IRC for ages

Gingeropolous commented 6 years ago

IRC == Insider Relay Chat. ( mic drop )

/ end of issue

fluffypony commented 6 years ago

@Wigmnd Nope, you just haven't been paying attention for the LAST TWO YEARS, but everyone else has.

https://monero.stackexchange.com/questions/7233/would-a-fixed-ring-size-improve-anonymity

https://github.com/monero-project/monero/issues/1673#issuecomment-277584232

https://github.com/monero-project/monero/issues/1673#issuecomment-277598381

https://github.com/monero-project/monero/issues/1673#issuecomment-314561088

https://github.com/monero-project/monero/issues/1673#issuecomment-314567593

https://github.com/monero-project/monero/issues/1673#issuecomment-328359308

https://www.reddit.com/r/Monero/comments/8ullip/im_back_with_even_more_monero_subs/e1gsq36/

https://www.reddit.com/r/Monero/comments/92vkdf/july_monthly_report_from_sarang_noether/e38t5je/

https://www.reddit.com/r/Monero/comments/7oy8vx/monero_transactions_are_about_to_get_80_cheaper/dsdmn0u/

https://www.reddit.com/r/Monero/comments/65iund/thoughts_output_selection_and_ring_size_and/

https://www.reddit.com/r/Monero/comments/7y97yo/higher_ringsize_than_the_default_stands_out/duevwc0/

https://www.reddit.com/r/Monero/comments/6zqqvn/is_ability_to_choose_custom_ring_size_a_problem/

https://www.reddit.com/r/Monero/comments/79akfc/who_is_this_person_everyday_he_send_tx_with_41/

https://www.reddit.com/r/Monero/comments/82flll/at_what_block_will_the_fork_happen/dvahsx4/

https://www.reddit.com/r/Monero/comments/6tjy9k/ringct_20_by_sun_et_al_is_this_a_part_of_monero/dllp62y/

https://www.reddit.com/r/Monero/comments/7st3v1/with_bp_incoming_in_september_can_we_start/

b-g-goodell commented 6 years ago

@wigmnd rather than providing a detailed response to each of your comments, I'm going to say two things.

First, I strongly recommend that you read this book here. You have displayed a disconcerting level of misunderstanding of some very basic properties of crypto schemes like statistical indistinguishability that have been known for decades. Don't hop onto a racecar engine discussion forum with only rudimentary understanding of internal combustion and start giving advice to other forum members. To be as polite as possible: it's a bad look, dude.

Second, MRL has public meetings once a week to discuss issues like this, and so does monero-dev. You keep saying basic research has not been done, or that it's based on information not available to the public, etc etc, and yet you have not attended a single meeting, you haven't bothered asking for logs, nothing. You come into this conversation as fresh as possible, demanding a complete re-iteration of over a year's of discussions. If you don't think those discussions are public, it's because you haven't looked for them in monero-dev logs, monero-research-lab logs, nowhere. It's not only lazy, it's offensive. It's like a freshman physics student walking into a PhD level class and demanding definitions of every word, and you are totally unaware that you are asking for years of education to be compressed into a few moments. You then follow these statements up with "my request for basic information isn't unreasonable" and "I don't want to be rude to the professor but I feel like she should walk us through this info from the start, you know?"

If you think you are acting in good faith, please take a step back and re-assess your behavior.

fluffypony commented 6 years ago

@Wigmnd also less sealioning would go a long way to making it look like you're acting in good faith.

SamsungGalaxyPlayer commented 6 years ago

@Wigmnd I have written about my points several times. Would you mind condensing your question down? I have discussed the logic behind my position repeatedly over several comments.

fluffypony commented 6 years ago

@Wigmnd I've pointed to a bunch of places (not IRC) where it's been discussed over two years, and @b-g-goodell has given you a write-up on why its necessary. I would recommend you go through those, and understand all of them, before you respond to this thread again.

SamsungGalaxyPlayer commented 6 years ago

There has been a significant amount of conversation here, and the objection has boiled down to the following:

There is not a significant amount of evidence that supports a fixed ringsize.

However, MRL and most other devs say that:

There is more evidence in favor of a fixed ringsize than a non-fixed ringsize.

Given that there are real attacks possible from the metadata revealed from an unusual ringsize, and that there have been no stated reasons why a non-fixed ringsize is better in any scenario, I am going to close this comment since no new topics are being introduced. I strongly recommend that Monero moves to a fixed ringsize in the September/October v8 network upgrade.

I strongly encourage @Wigmnd and anyone else to come up with new evidence to support their position. They can then open a new issue to recommend it is changed back. At the present, we have more reason to make the change than not make the change for all the scenarios and attacks that @b-g-goodell, @SarangNoether, @iamsmooth, @ArticMine, and I have listed.

@Wigmnd we have walked you through a large number of these scenarios. All of us feel comfortable with them. If you are unable to understand the evidence that we have presented, we encourage you to spend more time learning from the resources that we have linked. It is a difficult issue to understand well. Respectfully, the only point you have brought up is that we have not brought up enough points, which the majority of contributors here disagree with. You did not quantify any of your other scenarios, and there was ample evidence to dispute them.

For these reasons and more this discussion is now closed.