Open MWICA opened 3 years ago
If this does not already exist, would it be a good idea for the tumble script to wait for a random # of the other equal sized utxos to be spent before spending its own? The random # could be anywhere from 0 - [number of participants].
This is always a nuanced and interesting point to discuss, and it boils down to JoinMarket's maker/taker model: we have two clearly different behaviors, and this can be exploited. There was a similar discussion recently in #864, where you can also find the link to a Belcher gist. In particular, the section about increasing waiting time would help with this issue. You can increase your waiting time much more if you want.
The problem as you presented it is definitely a valid heuristic that an observer may use. We can imagine an extreme where, 5 minutes later, one of the equal outputs is spent again in a CoinJoin. The chance that this was the taker (instead of a maker being picked twice in 5 minutes) is not 100%, but it is definitely worth considering.
I don't think your solution is feasible though. What if the other equal outputs are not spent for days/weeks/months for whatever reason? Are we gonna wait weeks before proceeding with the tumbler? It also requires tracking many more UTXOs that would otherwise be irrelevant to us, i.e., it seems complicated. I think increasing the waiting time on your own basically does the same. Imagine that you wait 1 day before spending the equal output, and that you are still the first to spend. The above heuristic is much weaker now, as there is a good chance that some maker will be picked again within a 1-day timeframe.
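To put rough numbers on the "5 minutes vs. 1 day" intuition, here is a toy model, not a measurement. It assumes each maker's equal output is respent independently at a constant rate (exponential waiting times). The rate of 0.02 per hour and the function name are made up for illustration only.

```python
import math


def prob_taker_spends_first(n_makers: int,
                            maker_respend_rate_per_hour: float,
                            taker_wait_hours: float) -> float:
    """Probability that no maker output is respent before the taker's.

    Toy assumption: each maker's respend time is exponential with the given
    rate and the makers are independent, so the chance that none of the
    n outputs has been respent by time t is exp(-n * rate * t).
    """
    return math.exp(-n_makers * maker_respend_rate_per_hour * taker_wait_hours)


# Illustrative values only: 5 makers, each respending about once every 50 hours.
for wait_hours in (5 / 60, 1, 24):
    p = prob_taker_spends_first(5, 0.02, wait_hours)
    print(f"wait {wait_hours:6.2f} h -> taker is first spender with probability {p:.3f}")
```

Under this model the "first spent output is the taker" guess gets weaker as the wait grows, but how quickly depends entirely on the real respend rate, which would have to come from chain data.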
Similar to the conclusion in #864, I think the ultimate answer given the current JoinMarket model is to do multiple CoinJoins and, if possible, to mix roles, so that all these heuristics are hopefully severely injured along the way.
Yes this could be a concern. Thanks for opening the issue.
As you said, it really depends on how many other people are using joinmarket at the same time. A good default wait time depends on that: in low-activity times the default might be too short, while in high-activity times it might be longer than needed.
Waiting for other UTXOs to be spent first is a good idea. It's not that complicated; it could be done by polling gettxout. Perhaps the tumbler could do it for each coinjoin with a coin flip probability of 10% or something like that. That could be a good way to solve the problem of "the best wait time depends on how many other people use joinmarket" because it will automatically re-adjust based on how many other people use joinmarket. We have to make sure we don't end up in a deadlock where two tumblers each wait for the other one to spend UTXOs first, but the random coin flip could help there, or they could both be waiting for a third tumbler to come along.
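A minimal sketch of that idea, under stated assumptions: `gettxout` is passed in as a callable that behaves like Bitcoin Core's RPC of the same name (it returns nothing for an output that is spent or unknown). The function names, the 10% default, the polling interval and the timeout are all hypothetical and not part of JoinMarket. The timeout is my own addition, to cover the "what if nobody spends for weeks" and deadlock cases.

```python
import random
import time
from typing import Callable, Iterable, Optional, Tuple

Outpoint = Tuple[str, int]  # (txid, vout)


def count_spent(outpoints: Iterable[Outpoint],
                gettxout: Callable[[str, int], Optional[dict]]) -> int:
    """Count outpoints that are no longer in the UTXO set."""
    # gettxout returns null (None here) when the output is spent or unknown.
    return sum(1 for txid, vout in outpoints if gettxout(txid, vout) is None)


def wait_for_other_spends(others: Iterable[Outpoint],
                          gettxout: Callable[[str, int], Optional[dict]],
                          wait_probability: float = 0.1,
                          max_wait_seconds: float = 86400.0,
                          poll_seconds: float = 600.0,
                          rng=random,
                          sleep=time.sleep,
                          clock=time.monotonic) -> int:
    """Maybe wait until a random number of the other equal outputs are spent.

    Returns the number of spends that was waited for (0 means no waiting).
    """
    others = list(others)
    # Coin flip: most of the time do not wait at all.
    if not others or rng.random() >= wait_probability:
        return 0
    target = rng.randint(1, len(others))
    deadline = clock() + max_wait_seconds
    # Poll until enough other outputs are spent or the timeout is reached.
    while count_spent(others, gettxout) < target and clock() < deadline:
        sleep(poll_seconds)
    return target
```

In a real tumbler the callable would wrap the node's RPC connection; it is injected here so the logic can be exercised without a node.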
If I'm correct tumbles can be undone by just simply following the first spent (or sometimes second) common sized utxo. From one test I saw that it took more than 8 hours but less than 15 hours before the first utxo that did not belong to me was spent, and no other utxos had been spent yet. It might even be days before the other utxos are spent. I am doing another full tumble on chain as a test with the default settings to mimic a user. I can report back with the results once it has completed.
If untraceability through tumbles is reliant on breaking this heuristic of [following common sized utxo first spent] then joinmarket cannot be considered safe for mixing bitcoin. In theory and in an ideal world there would be a lot of activity which might break this heuristic, but the real world is different. There is little onchain traffic, and even less JM activity.
If I am correct, how can JM be considered safe for users? Especially with the default settings. Right now in the tumble scheduler most delays are less than an hour.
Would increasing the time delay drastically (to hopefully not be the first spent common sized utxo) even help with this?
If so then why is the default time not increased and why are users not warned about this?
Outside of the default time delay settings, users can increase the time by manually modifying the tumble schedule file. Or by randomly stopping the tumble script halfway and continuing hours or even a day later. But if this is a good idea it should be default behavior.
Would acting as a maker before, after or even in the middle of a Tumble fix these issues? By having others spend our common utxo output first before us?
Does being a Maker leak our utxos to others?
If I'm correct tumbles can be undone by just simply following the first spent (or sometimes second) common sized utxo
It's not that simple. It's all probabilistic: every time you take that kind of guess you lower the overall likelihood of your model. The ambiguity piles up at every coinjoin and you have no way of knowing whether your guesses were correct or not. I don't think you can disregard that (in theory; in practice bad user practices, sybil, solving subset sum, etc. may of course hugely impact this).
From one test I saw that it took more than 8 hours but less than 15 hours before the first utxo that did not belong to me was spent, and no other utxos had been spent yet
If you want, you can look at the CoinJoins directly on the blockchain, so you don't have to test by doing actual CoinJoins yourself. You can use the snicker-finder tool in JoinMarket itself or filter the blockchain transactions yourself in another way (I have a toy repo to do that and Kristapsk has one too IIRC). This way you can calculate statistics over a much larger number of CoinJoins.
Also, you can create a lot of tumble schedules without running them, just to see how JoinMarket creates them.
If untraceability through tumbles is reliant on breaking this heuristic of [following common sized utxo first spent] then joinmarket cannot be considered safe for mixing bitcoin. In theory and in an ideal world there would be a lot of activity which might break this heuristic, but the real world is different. There is little onchain traffic, and even less JM activity.
Activity is low, but only relatively so: there are actually thousands of bitcoin of volume each month (split over hundreds of CoinJoins, ~20-40 per day) and JoinMarket is likely the CoinJoin implementation with the largest volume (hard to say exactly due to false positives, but it surely is in the top 2 of the open source CoinJoin implementations that I know of).
That being said, I do agree that 1h average default is low. Though, I don't really have a magical better number to propose.
If I am correct, how can JM be considered safe for users? Especially with the default settings. Right now in the tumble scheduler most delays are less than an hour.
JoinMarket is much more than the tumbler alone and, as said above, I don't think this breaks JM as easily as you think. However, it is true that the tumbler is generally advertised as JoinMarket's strongest privacy tool, so we should be extra careful about what is being promised.
I think the tumbler can be considered much better than nothing and probably better than many other CoinJoin techniques and implementations out there. It's always about trade-offs.
Would increasing the time delay drastically (to hopefully not be the first spent common sized utxo) even help with this?
IMHO for sure and it's a no brainer if you are worried about this stuff, as you should be if your threat model is sophisticated enough.
If so then why is the default time not increased and why are users not warned about this?
I cannot speak for others, but it's a more nuanced topic than what you depict and I've heard JM developers talking about this stuff for years. Anyhow, I'm personally always in favor of more documentation.
Would acting as a maker before, after or even in the middle of a Tumble fix these issues? By having others spend our common utxo output first before us?
Mixing roles is always good for privacy, and it's a no brainer if you are worried about this stuff.
Does being a Maker leak our utxos to others?
You reveal your UTXOs and input-output linkages to the taker (since he acts as the coordinator), but only to him.
A good read that includes all this stuff (and more) is Waxwing's blog post, in particular the Recommendations for users section at the end (though, if you are interested in this stuff there is no reason not to read it all).
The default delay time is very insecure for anonymity. One can increase their delay time, yes, but most users use the default. Secure settings should be the default for a wallet that people are using with an expectation of security (by security I mean the security of someone's privacy). With the default settings one will almost always be the first utxo to be spent. Even if not the first one every time, being just the second one spent still creates a small list of suspects. The delays in a tumble schedule I generated with the tumbler.py script were mostly under an hour. The default should be not an average time of delay, but a random delay time of between 1 and 12 hours in my opinion.
SUCCESSFUL TUMBLE DE-MIXING USING ONLY A BLOCK EXPLORER
I will refer to trying to trace a transaction through joinmarket from the source to the destination (pre-mix to post-mix) as "forward tracing", and to the attempt to trace something from the destination to the original source (post-mix to pre-mix) as "back tracing".
BACK TRACING (SUCCESSFUL)
I have successfully backtraced (unraveled) a post-tumbled UTXO to the original deposit into the joinmarket mixer. I have only done one test so far, but this first test was successful. The tumbled coin was backtraced to the original pre-mix utxo only by observing a block explorer. I did this by following the freshest coinjoin utxo backwards through the chain of coinjoin walls, starting with the post-mix coin, until hitting the deposit (see # 2 below). And yes this was even with longer delays than the default tumbler schedule.
I have the post-mix utxo to prove this if anyone does not believe me and wants to replicate this claim. But I am uncertain if I want to post it publicly for privacy reasons.
@chris-belcher @AdamISZ
FORWARD TRACING (still testing; this is what originally prompted me to open this issue). Following the commonly sized coinjoin output that is spent first.
Some additional notes:
1. (related to forward tracing) sendpayment.py does not conceal the destination when paying to a non-joinmarket destination. The destination output can easily be determined by seeing which utxo does not go back into a coinjoin. But I believe sendpayment.py can conceal the source, i.e. which joinmarket user the payment came from. This can be used to potentially identify when a tumble exits from joinmarket into the destination.
2. (related to back tracing) When following the freshest utxo backwards, the one that is the initial deposit can be determined by seeing that it had no other coinjoins before it. The initial deposit is not always the freshest UTXO.
Summary: forward tracing is only a theory for now, but in my first and only test I was able to backtrace successfully.
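For anyone who wants to repeat the back-tracing test on their own data, the procedure described above can be sketched roughly as follows. The data layout is hypothetical: each transaction is a dict with an `is_coinjoin` flag, a confirmation `height`, and the txids of the transactions that created its inputs. How coinjoins are detected and how the data is fetched is left out.

```python
from typing import Dict, List


def back_trace(start_txid: str, txs: Dict[str, dict]) -> List[str]:
    """Follow the freshest input backwards until a non-coinjoin is reached."""
    path = [start_txid]
    current = start_txid
    while txs[current]["is_coinjoin"]:
        # "Freshest" input: the one created by the most recently confirmed parent.
        parent = max(txs[current]["inputs"], key=lambda p: txs[p]["height"])
        path.append(parent)
        current = parent
    # The last element has no coinjoin before it: the suspected deposit.
    return path


# Toy chain: three coinjoins where the taker's input is always the freshest.
toy_txs = {
    "deposit":    {"is_coinjoin": False, "height": 100, "inputs": []},
    "maker_old1": {"is_coinjoin": False, "height": 50,  "inputs": []},
    "maker_old2": {"is_coinjoin": False, "height": 60,  "inputs": []},
    "cj1": {"is_coinjoin": True, "height": 110,
            "inputs": ["deposit", "maker_old1", "maker_old2"]},
    "cj2": {"is_coinjoin": True, "height": 115, "inputs": ["cj1", "maker_old1"]},
    "cj3": {"is_coinjoin": True, "height": 120, "inputs": ["cj2", "maker_old2"]},
}
print(back_trace("cj3", toy_txs))
```

The sketch identifies inputs by parent txid only and always takes the single freshest one, so it says nothing about how confident the guess is at each hop.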
An addition to my last comment. While not being the early spender of a common sized coinjoin output might protect from forward tracing, at this time I do not think that it would make a difference with back tracing.
And yes this was even with longer delays than the default tumbler schedule.
Grrr so aggravating. I modified my tumbler schedule to use much higher delay time vs the original schedule file. The original schedules mostly used delays of less than an hour, sometimes even less than 5 minutes. But the tumbler overwrote my modified schedule file with another insecure one with very short time delays. Maybe I forgot to use the --reschedule flag after modifying the file. As a result, in my test only one coinjoin in the tumble was delayed (by at least an hour), because it got paused asking for a bitcoin address. So back tracing is confirmed with the default joinmarket settings. I plan to do another test soon but with a modified schedule of higher delays, or I'll see if I can find one of my other tumbles that used a modified schedule to see if the same method of back tracing is successful on those as well.
Why does backward tracing work but forward tracing not work in your tests?
I modified my tumbler schedule to use much higher delay time vs the original schedule file.
Two things on this: to use longer delays, read the --help and modify the corresponding argument (-l). We don't expect people will generally want to edit schedule files, but:
... second, you can use sendpayment.py with the -S flag to choose your own custom schedule (so you could take the output of tumbler initially, modify it and then write it to any file and specify it that way, or just write your own)
The default should be not an average time of delay, but a random delay time of between 1 and 12 hours in my opinion.
As per the help mentioned above, it is indeed random.
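To illustrate why a random delay with a short average still produces mostly short waits, here is a small sketch. It assumes the delay is drawn from an exponential distribution whose mean is the configured value. That is an assumption for illustration; check the --help text for the exact definition. The mean values below are examples, not JoinMarket defaults.

```python
import math
import random


def fraction_below(threshold_minutes: float, mean_minutes: float) -> float:
    """P(delay < threshold) for an exponential delay with the given mean."""
    return 1.0 - math.exp(-threshold_minutes / mean_minutes)


def sample_delay_minutes(mean_minutes: float, rng: random.Random) -> float:
    """Draw one delay; expovariate takes the rate, i.e. 1 / mean."""
    return rng.expovariate(1.0 / mean_minutes)


# Share of delays shorter than one hour for a few example means.
for mean in (30, 60, 360):
    print(f"mean {mean:3d} min -> {fraction_below(60, mean):.1%} of delays under 1 h")
```

Under this assumption, raising the mean is what moves the bulk of the delays past the one-hour mark; the randomness alone does not.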
--reschedule
There is no reschedule flag. I guess you meant --restart - and yes that's a good point. If you just run tumbler.py with the original arguments, then it'll just ignore what's in the default schedule file TUMBLER.schedule. So it just comes back to changing the arguments as per the above.
Since there seems to be renewed interest in this topic, maybe this will be helpful to somebody. It's a list of JM-like CoinJoin transactions in the last 4k blocks, from block 690788 to 694787 included (I can't go much further because the node I'm using is pruned).
The file is a (messy) Pandas dataframe in feather format with all the transactions (previous output information from getblock verbosity 3 included).
The transactions are stripped of stuff like scripts and witnesses, but those shouldn't be needed for this analysis.
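For readers wondering what "JM-like" could mean in practice, here is one rough way such a filter might look. This is my own illustrative pattern, not necessarily the rule used to build the file above or the one in snicker-finder: several outputs share one value, there are at most as many other (change) outputs as equal ones, and there are at least as many inputs as equal outputs. The function name and the minimum of 3 equal outputs are assumptions.

```python
from collections import Counter
from typing import List


def looks_like_jm_coinjoin(input_count: int,
                           output_values_sats: List[int],
                           min_equal_outputs: int = 3) -> bool:
    """Rough pattern check for an equal-output coinjoin (illustrative only)."""
    if not output_values_sats:
        return False
    # Size of the largest group of outputs sharing the same value.
    _, n_equal = Counter(output_values_sats).most_common(1)[0]
    n_other = len(output_values_sats) - n_equal
    return (n_equal >= min_equal_outputs      # enough equal-sized outputs
            and n_other <= n_equal            # at most one change output each
            and input_count >= n_equal)       # every participant funds an input


# Four equal outputs plus three change outputs, funded by five inputs.
print(looks_like_jm_coinjoin(5, [12_300_000] * 4 + [500, 900, 1200]))
```

A pattern like this will produce false positives, which is one reason volume estimates from it are only approximate.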
Thanks @PulpCattel. We really need more data on this issue I think. It would be good to figure out how long coinjoin outputs remain unspent and therefore whether you can use them to unmix a tumbler run, and also to figure out if someone can repeat OP's analysis of unmixing tumblers this way.
It seems likely that fidelity bonds will have changed things significantly. Fidelity bonds result in a small number of makers who have valuable bonds being chosen for coinjoins much more often, which means their coinjoin outputs are more likely to be spent sooner. So going back even 4k blocks might not give us accurate information for the future because most of those 4k blocks will be before fidelity bonds.
Yes that's true, it can be interesting to see the difference though, e.g., see if it changes pre and post fidelity bonds, see what changes, how quickly it changes, etc. And of course, knowing if something was going badly in the past can't hurt.
There is no reschedule flag. I guess you meant --restart -
Yes my error. I meant --restart.
Basically I think this is an issue of a tumble being run with too short a time delay between coinjoins. The reasons for back tracing and forward tracing, while separate, are related to the same thing: not giving enough time for other equal sized outputs to be potentially spent first (forward tracing), and not giving enough time for inputs of other makers to be newer than your utxo from the last coinjoin in the tumble (back tracing).
Proposed solutions/enhancements:
I think that last issue, related to what I call back tracing, can be fixed by having the client choose at least a couple of utxos from the makers which are not older than the utxo of the last coinjoin.
And I think long time delays will fix forward-tracing, and possibly even back-tracing.
I am going to also test with the longer custom set delays in my schedule to see if that breaks such timing heuristics.
Test with default generated schedule timing: Back-tracing [Successful], Forward-tracing [Checking]
Test with long time delays: Back-tracing [Test planned], Forward-tracing [Test planned]
I completed a tumble with the default short time delay, and just need to check my forward tracing theory when I have time later on. Tomorrow I plan to test with long time delay.
Random idea on a potential improvement to JM. While thinking about breaking the timing heuristic I had a thought. I am still thinking about this. What if the time delay is just done away with, and instead the taker that is tumbling selects at least some or all (random count?) of the same makers and their corresponding common sized utxos from the last coinjoin each time, maybe in conjunction with newly selected makers each round? I think this would break the forward and backward timing heuristics.
Would this be possible while still preventing pre-mix and post-mix linking of correlating balance entering into joinmarket vs. leaving joinmarket? Maybe there might have to be a time delay between each balance fragment (am I making sense?).
Update
Test with default generated schedule timing: Back-tracing [Successful], Forward-tracing [Unsuccessful]
Test with long time delays: Back-tracing [Testing], Forward-tracing [Testing]
Notes: Back-tracing with the default generated schedule was 100% successful in the one test I did. I have the pre-mix and post-mix addresses to prove this if it is needed. Forward tracing was not successful in this test, but in times of low JM activity it might be, with such short time delays. I am now testing both back-tracing and forward-tracing with the schedule file modified for longer time delays to see if this fixes the back-tracing issue or not. Even if not, my potential solution described above might be the answer. I should have results before Thursday; it is now Monday.
@chris-belcher sorry to tag you directly, but I think that this is an issue with the tumbler not being safe at the moment with the default settings, as it is vulnerable to back-tracing based on my 1 out of 1 tests. Maybe another tumble run with default settings might be immune, but just from this one test I did, it strongly looks like this could be reproduced. I might PGP encrypt and email you the pre and post mix addresses and txids. I am testing now with longer time delays. I know you are busy, so if I should private message it to someone else I can. Thank you
Update
Test with default generated schedule timing: Back-tracing [Successful], Forward-tracing [Unsuccessful]
Test with long time delays: Back-tracing [Unsuccessful]*, Forward-tracing [Unsuccessful]
*Unlike with the default time delays, with the longer time delays back-tracing is not completely successful. In my test with longer time delays the utxo I actually owned was a few times the freshest utxo among the inputs of the coinjoin. But the other times it was not the freshest utxo, although it was sometimes the second freshest.
I only did one test each for the default settings and for the modified settings with longer delays. My conclusion from these two tests is that the default time delay setting in the tumbler is NOT SAFE. With the default setting the tumbler is vulnerable to back-tracing. Increasing the delays degrades the probability that a suspected pre-mix utxo is the right one: the greater the random time delays are, the more unreliable back-tracing becomes (as long as there is enough joinmarket activity to make selected maker utxos fresher than yours). My overall test of longer time delays did a full tumble in just under a day and a half.
Proposed change to joinmarket tumbler
As an example, there is a coinjoin with 5-7 outputs that have a uniform output value of 0.123.
Can it be assumed that the first of these outputs to be spent in another coinjoin is the one belonging to the taker running the tumble script? It might in most cases take longer before the other outputs, which belong to the makers, are spent in another coinjoin. Especially if joinmarket activity is not that high.
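The guess in question can be written down as a small sketch, mainly to make explicit what an observer would be assuming. The input format is hypothetical: a mapping from the output index of each equal-sized output to the block height at which it was spent (or None if still unspent).

```python
from typing import Dict, Optional


def first_spent_equal_output(spend_heights: Dict[int, Optional[int]]) -> Optional[int]:
    """Return the index of the equal output spent first, or None if unclear."""
    spent = {vout: h for vout, h in spend_heights.items() if h is not None}
    if not spent:
        return None  # nothing spent yet, no guess possible
    earliest = min(spent.values())
    candidates = [vout for vout, h in spent.items() if h == earliest]
    # A tie in the same block leaves the guess ambiguous.
    return candidates[0] if len(candidates) == 1 else None


# Five equal outputs of 0.123; only two have been spent so far.
print(first_spent_equal_output({0: None, 1: 694100, 2: None, 3: 694350, 4: None}))
```

This only picks the earliest spender; whether that output really belongs to the taker is exactly the open question, and the answer depends on how quickly maker outputs get reused.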