Duplication of offers from makers. Many makers one wallet

JoinMarket-Org / joinmarket

CoinJoin implementation with incentive structure to convince people to take part

398 stars 119 forks source link

Duplication of offers from makers. Many makers one wallet #693

Open chris-belcher opened 7 years ago

chris-belcher commented 7 years ago

Sometimes on the orderbook you see a few different makers all with the same minsize and maxsize. What's probably happening is somebody is running more than one instance of yield generator from the same joinmarket wallet. This increases their profits slightly at the expense of de-legitimizing the system because if a taker coinjoins with all those makers who are controlled by the same person, they can be spied on.

The taker code assumes every IRC nick is a different entity so treats them separately. This is a mistake because IRC nicks are free to produce.

chris-belcher commented 7 years ago

Wrote up an idea here for solving this using sorted merkle trees https://gist.github.com/chris-belcher/eb9abe417d74a7b5f20aabe6bff10de0

chris-belcher commented 7 years ago

@adlai and I had a conversation in IRC earlier, we realised that the maker doesn't need to commit to his UTXO-count in public, they can reveal it later when they send their proof privately to the taker. So this particular privacy problem is gone.

Edited the document to reflect this.

chris-belcher commented 7 years ago

There might be a scale problem for when the honest-makers-replaying-proofs-of-fraud. Only one honest maker is necessary but what could happen is every maker uploads all that data each time, which seems wasteful.

AdamISZ commented 7 years ago

(Bit of a side-note but worth saying:)

I think it's important to distinguish two different behaviours here, and two (at least) different motivations:

one wallet, many bots - an attempt to gain unfair economic advantage, follows the protocol normally (or tries to - requires some code tweaks, can and usually will result in privacy degradation even if done right)
one or zero wallets, many bots - an attempt to create a DOS effect on takers (it seems unclear whether it could also aid snooping) by simply refusing to continue the protocol after receipt of !auth.

This issue is about 1/ (I'm not 100% sure, but I don't think it'll be easy for it to address 2/ ? at least, not as described in the gist); it's worth noting that it was actually 2/ that was causing a mini-crisis for JM in the immediate aftermath of the 0.2 protocol update, and was only really resolved in the 0.2.2 update that allowed to fallback to less than the initially requested makers.

I've read through the gist now, thanks for it, I'll doubtless have more questions about it later.

AdamISZ commented 7 years ago

Looking at someone else's thoughts on reddit made me remember writing this gist, see the third bullet point in that section, for convenience here:

Taker chooses 5 of the 10, with additional check: for any utxo that is seen to be in more than one received list, he bans (adds to ignored_makers) both of those Makers. Note that this is a disincentive to publish multiple orders, but would not totally prevent such a Maker from publishing more than one offer on the same utxo set; he would need to do some fairly sophisticated coding though, to create the right kind of non-determinism in his selection. In the limit, if he codes it to completely partition the utxo set between different offers, then this is back to an honest protocol execution anyway.

This is part of @adlai 's taker sophistication approach; on the surface it seems like a cool amelioration of the problem, but it does have a pretty serious fly-in-the-ointment: any system that kicks two bots that duplicate utxos has to be aware that an adversary could use this to kick the competition if he knows their utxos. This could probably be addressed with signing, but has to be done carefully.

Thought it was worth adding to the thread as another idea in case people weren't aware.

AdamISZ commented 7 years ago

Hmm looking at that idea again with fresh eyes, I see (1) a really cool feature of it, but (2) a reason why it's probably just not going to work against a sophisticated attacker.

We can leverage the non-repudiability of signatures: suppose that each individual utxo the maker offers is signed by the nick of the bot (just as all IRC messages are currently signed). Then if bot1 and bot2 both offer, and sign, Utxo U1, then the Taker could prove this misbehaviour; for example, it could send privmsg !ban <bot1nick> <bot2nick> <proof of misbehaviour = sig(bot1nick, U1+takernick), sig(bot2nick, U1+takernick)> to everyone else in the pit, leaving the misbehaving bots unaware they've been banned.
Unfortunately it seems pretty clear that with a modicum of coding, this whole idea doesn't really work - because if someone codes two Maker bots that talk to each other, they can always ensure they don't conflict on the utxos they offer in 1 transaction. That's not the same as them partitioning the utxo sets they offer, they just dynamically choose different subsets in each single transaction.

So unless I missed something, this idea is not good enough to fend off reasonably motivated economic attackers.

chris-belcher commented 7 years ago

Sending to everyone else in the pit also wont work because the adversary can connect with another bot of theirs that never says anything.

AdamISZ commented 7 years ago

Sending to everyone else in the pit also wont work because the adversary can connect with another bot of theirs that never says anything.

Yeah, lol, I was getting overexcited there! Still that transferrability property is something to consider if it's set up right (crypto-wise), even though of course you don't get that extra "in secret" feature which I was erroneously imagining.

Further thoughts along these lines, which I still think are going nowhere but worth mentioning, would be to have some kind of "guard" bots that probed makers' utxos and then did the public reporting of bad behaviour, with the obvious and slightly amusing problem that this behaviour was made deliberately expensive by our solution to #156 . With a trust point you can avoid that cost, but then, you can do pretty much anything with a trust point. With that idea thrown out, I still don't have a good idea here.

AdamISZ commented 7 years ago

PODLE for committing to Maker utxos

I'm not sure why I initially rejected this concept, but thinking about the problem again today, and remembering this reddit post made me re-examine it. It may be a good idea. Further chat on IRC brought up a tweak from @chris-belcher that may make it more practical.

Outline of idea.

The germ of the thinking is something like this: we need the maker to really publish their full list of utxos to avoid duplicates, but they need to be private -> note this sentence is not 'true' but a hypothetical assertion.

We already have a primitive that provides the properties of a commitment (hiding, binding) but with the additional feature of uniqueness - the PoDLE as currently implemented for takers for rate-limiting. Note: in rate-limiting mode it's nice that we can use multiple NUMS base points, and hence different PoDLE values (or H(P2) values); see bitcoin/podle.py and the config var taker_utxo_retries. More on this aspect in a moment.

So an outline of a simple idea would be:

Taker -> !orderbook Maker1 -> !reloffer ... (etc) ... !hp2 utxo11 !hp2 utxo21 .. !hp2 utxoN1 Maker2 -> !absoffer ... (etc) ... !hp2 utxo21 !hp2 utxo22 .. !hp2 utxoN2 etc.

The !hp2 values would be generated similarly to how they are currently generated by Takers for each utxo/pubkey, but with only a single globally agreed NUMS point (such as index 0).

Taker needs to maintain a database of all !hp2 values he sees (~~at least, for some time~~ in this particular proposal, only per-transaction). The second time he sees the same !hp2 value from a Maker, he bans that Maker.

Then, in the !ioauth message, the Maker will need to provide valid podle reveals/signatures (P, P2, s, e - note he is already providing the utxo itself) corresponding to each utxo he chooses to use in the transaction. Taker verifies that the podles are valid, and that they correspond to a subset of the hp2 values provided in the Maker's offer message.

In the simplest analysis, there is no possibility for a rogue maker to cause banning of another - because he cannot predict podle values in advance, he cannot pre-broadcast in such a way as to cause the honest maker to be banned.

However, this analysis is wrong in practice: since Makers can and do re-connect regularly, and are offering utxos for long periods, they must re-announce the same podle values under new nicks(identities), which according to this simplistic rule, is banned behaviour.

Correction due to @chris-belcher : use a Taker-generated nonce to tweak the hp2 value. Algebraically, I'd imagine it working like this:

Taker: !orderbook [16 byte nonce]

Maker: calculate H(P2) using the procedure here, but with the tweak that the nonce is fixed in the hash:

hp2 value = H(P2||taker-nonce)

(I think this is the right way to do it; agree?)

Thus, the podle values are deterministic from this Taker's point of view, so repeated utxos are disallowed, but repetitions for different transactions are allowed, unless the Taker deliberately re-uses a nonce, which serves no purpose as he is only attempting to enforce honesty on his counterparties.

Also note that since !orderbook is a 'global' command (it takes data from all active Makers), there is no sneaky way to somehow 'hope' that you're holding some information back as a Maker. If you deliberately only reveal some subset of your utxos in your offer, you can only get a transaction based on that set. Partitioning the set into different bots is fine. The core concept is to unambiguously fix the utxo set on offer in advance, without revealing it.

Problems

Scale and scalability.

One main reason such ideas don't immediately crop up is the bloating of message passing data. Here we could imagine ~ 30 utxos meaning 30 x 64 hex chars or 30 x 32 if we truncate hp2 values in half (should be fine security-wise, 16 byte entropy) being broadcast in the offer message, and by all makers. Then, in addition, we have a large chunk of 100-200 bytes per utxo-used-in-transaction added to the !ioauth message. Note that this is a scale problem; re: scalability this data is linear in the size of utxos, so at least we don't have a blow up. But it's possible that some improvement on the #650 front is realistically necessary before implementing this, if it is indeed workable.

Privacy

It's probably not the case that adding these podle-style commitments to the offer message adds any important privacy leak. Note that the offer publication is before the Taker-provided rate-limiting commitment in !fill so Takers can read them whenever they like; they might leak utxo number data (padding is one possible improvement on that), but repetition is probably not an issue assuming Takers do not maliciously repeat nonces (this can be banned I think). In the worst case some kind of data leak could help takers identify bots cross-nick, but even if that's true note we in the past had bots using fixed nicks. At the moment I'm not seeing this area being an issue, probably.

Public vs private?

The nonce idea (which as a reminder, is a solution to the problem that if we banned repetitions without this randomizer, then re-connecting bots would be banned - same utxos) works in the interactive scenario. If bots are still to publish their offers to the public channel, for passive participants (e.g. long running Takers), then it precludes them publishing !hp2 values according to this scheme. One approach is just to abandon public announcement. I'm not clear on this yet.

This was written as a comment rather than a gist because I'd like to canvas opinion about potential weaknesses before going further.

chris-belcher commented 7 years ago

Our IRC log of this conversation.

\<waxwing> was just thinking again about possible #693 solutions, copy-pasting here in case anyone has any thoughts on it (693 is the "one wallet, duplicate bots" problem) \<waxwing> \<waxwing> it's almost like, what you want is a crypto primitive that creates an object O that preserves privacy, but when you combine O1 and O2, if one of the hidden components inside them is equal, it triggers an event, such as changing the result to zero, or results in some other kind of failure \<waxwing> \<waxwing> another thought; not scalable but otherwise makes sense? each maker publishes !hp2 (u1) !hp2 (u2) .. (all his utxos) along with his offer. \<waxwing> \<waxwing> where here hp2s are still podles, but now only one globally agreed basepoint (no 3 tries etc.) \<waxwing> \<waxwing> keeps privacy, but now cannot re-use across different bots. \<waxwing> \<waxwing> as new txs are created, publishes new podles \<waxwing> \<waxwing> this isn't scalable in the publish (e.g. 50 utxos) and is also a bit unwieldy in the protocol, as would have to send the taker podle signatures on each used utxos. \<waxwing> \<waxwing> is there something else wrong here apart from scalability though? \<belcher> waxwing so spent utxo podles cant be revoked? \<waxwing> i was thinking it would make sense to ban only the second publisher of any one podle, because otherwise a DOS maker can just republish the previous maker to get him banned, based on the idea that podles are not predictable in advance this makes sense, BUT: \<waxwing> the problem there is, if you disconnect and come back on with a different nick, you "republish". \<waxwing> belcher: yeah good point that'd have to function like !cancel i guess. \<belcher> if you have the cancel, why is it not scalable? it still just announces O(#utxos) \<waxwing> it's a lot of publishing going on \<belcher> takers download O(#all maker's utxos) data \<belcher> an alternative to cancel could be timeouts \<waxwing> ok yeah i mean, it's not bad scaling, my mistake using that word; it's more 'scale' than 'scaling' \<belcher> so a disconnected and rejoining bot can wait for the timeout and reannounce... BUT a malicious DOSer might reannounce first which would be bad \<belcher> waxwing could the commitment somehow include the time? so the commitments become invalid after some time \<waxwing> yeah it seems to me quite a lot of this category of ideas suffers from this kind of "kick off other people by behaving in a way that would be bad if you were them" :) \<waxwing> belcher: i would think so yeah \<waxwing> we could operate in time slices, one base point per time slice \<belcher> ah yes because you can generate new points in a deterministic sequence \<waxwing> global time is left as an exercise to the reader :) \<belcher> global time just comes from the blockchain \<waxwing> oh yeah, i knew bitcoin was useful for something :) \<waxwing> codebase has 256 pre-set NUMS points already. not that it matters really. \<belcher> so the only downside is the data..? we've already seen the 'announcing lots of offers' method for forcing takers to download lots of data \<waxwing> i don't know maybe; i'm still not clear whether time can solve the problem of republishing being OK for you, but not for an attacker. \<waxwing> i was wondering whether some kind of signing could solve that problem. \<belcher> you need the private key to create the podle i believe \<waxwing> also i'm worried that there are privacy implications, simple example being that when you re-join the pit, you no longer have anonymity wrt your previous connection \<belcher> my thinking was the DOSer can stop you if you disconnect/reconnect and they announce your PODLEs first, but with the timeout they only stop you for some time rather than forever \<waxwing> yeah but if an attack is only to re-publish the hp2 itself, there is no requirement for privkey. it's about whether we ban at the republish step itself. \<waxwing> belcher: yeah, gotcha \<waxwing> well, hmm if you rejoin in a different time slice, maybe there's still some delink of identity, but it rather depends on time granularity too \<belcher> although you lose privacy between connections, any spy still only learns your commitments rather than any utxos \<waxwing> yes, absolutely, i was worried that it kind of devolves to the "fixed nick" case, and that number of utxos also leaks. to be fair, people used to actually use fixed nicks anyway. \<waxwing> i mean it doesn't literally devolve all the way to that, but maybe some of the way. \<belcher> i cant think why leaking #utxos is a bad thing \<waxwing> what would of course be nice is to somehow "wrap up" the long list of podles, eg a tree, but when you go down that road it's too easy to mess around faking or tweaking the list. i guess. \<belcher> one thing is all our talking about first and second announcements happen each time a taker says !orderbook, every time that happens the true maker and the DOSer will be in a race to reply to the taker \<waxwing> ah hmm yes you're right, it's really more about the response-to-orderbook, not so much the public channel \<belcher> so the issue with podles is they can be replayed \<belcher> how about challenge-response, a taker says !orderbook [nonce] and each maker replies with a podle that includes the nonce and therefore cant be replayed \<waxwing> well in general it isn't an issue when they function as commitments. but if we take banning action on them, before opening, then yeah it's an issue. \<waxwing> yeah something like that could work. i was thinking more about signing, but that's good - it takes advantage of the fact that we're already in an interactive scenario. \<belcher> on the other hand.. it means it can only work with taker's saying !orderbook and not with just annoucing to the public channel \<waxwing> so to be clear, we're ignoring the public channel messages here? \<waxwing> snap :) \<belcher> so long-running takers would have to say !orderbook every so often, which is bad for scale/amount of data transmitted \<belcher> so maybe challenge-response for !orderbook combined with time-ticking-podles for public channel announcements \<waxwing> i'm not clear whether that's an issue per se. consider tumbler - it only needs to update its orderbook when it does a new tx, not very common. \<waxwing> just basically tumbler acting as a sequence of sendpayments would. \<belcher> thats true, and the same is true for bitcoin-qt-integration where it could say !orderbook before every transaction \<belcher> waxwing remember makers have an incentive to keep the number of their utxos low because it costs miner fees to spend more of them \<waxwing> hmm was i overcomplicating things, could we delay ban until commitment opening? \<belcher> how does that work? \<waxwing> no, sorry, confused \<waxwing> i forgot which attack we're defending against :) \<belcher> but the data usage thing is an issue as we saw with the publish-lots-of-offers thing that AlexCato was fixing \<waxwing> yes, i think it's a fairly big issue without a better message channel. \<waxwing> we can truncate hp2s to 16 bytes, maybe a bit less with encoding. \<belcher> the issue with that fix of limiting offers-per-nick, but a DOSer can connect with many nicks \<waxwing> sorry 16 bytes = 32 chars hex i meant \<belcher> how was the too-many-offer thing actually a DOS? \<belcher> was it any more than just annoying.. since service wasnt denied i dont think \<waxwing> not clear at all; but do you remember that sql cursor crash thing? \<belcher> yep \<waxwing> if you remember i had great trouble reproducing, but i tend to think more likely than not, it was a result of that \<belcher> another issue is right now the sql database is in RAM, but it would be fine to make sqlite store it on disk instead \<belcher> the sql cursor thing was almost certainly a synchronization issue i think \<waxwing> but just generally why should we allow people to publish huge quantities of data to IRC? it's like, even if they're wrong that they're getting an advantage, it's much better to just kill it and forget about it. \<waxwing> well it was two threads trying to access db at same time, no doubt about that. \<belcher> yep \<waxwing> my only doubt was whether it was happening because of huge offer-lists, i think so, but can't be sure. \<belcher> so the limit-number-of-offers-per-nick works to stop people doing dumb things that dont actually help them (which is a good outcome) but doesnt stop a determined DOSer, although a determined DOSer cant do much anyway \<waxwing> yes, let's say. i'm reasonably confident nobody was trying to kill the system via DOS, I think they were just trying to get advantage. \<belcher> do you want to write up the maker-utxo-podle-commitment idea into a gist/blog post? \<waxwing> sure,but it's ill formed at the moment. i'll write down the outline to 693 though. \<belcher> whats ill-formed about it? \<waxwing> i think your nonce idea is a good one. that probably improves it a bit. \<waxwing> ill-formed, well i mean more like not fully formed. \<belcher> ah \<belcher> writing a blog post or gist hopefully helps with that because it makes the author have to think it all through \<waxwing> yes, but i think i'll write an outline to 693 first. give those interested a bit more time to digest, including me. \<waxwing> you can add your additional thoughts there \<belcher> ok \<belcher> the idea seems good so far :) it might solve the issue \<waxwing> someone needs to get cracking on #650 then :) \<waxwing> i started writing it but will probably -> dinner now, so later \<waxwing> https://github.com/JoinMarket-Org/joinmarket/issues/693#issuecomment-280936089 \<waxwing> hmm,no, error - the way i proposed using the nonce there doesn't work. \<waxwing> will have to think about it again. \<waxwing> oh ofc, just do H(P2||nonce) yeah. will edit.

chris-belcher commented 7 years ago

The taker needs to have access to the UTXO set, correct? They need to be able to know that a UTXO is truly unspent and what pubkeyhash it lives on.

If so, that makes it harder to do #653 because that has the problem that it doesn't have trustless access to the UTXO set. #682 is a solution but just amounts to trusting makers, the very entity we're trying to stop cheating here. There might be a way to make it work but requires a bit of thinking about.

AdamISZ commented 7 years ago

At first it seems clearly right; but then I wondered about the fact that we have the pubkey in receiving !sig. This started me thinking about whether it mattered whether we do the podle verification check in receiving !ioauth or receiving !sig. And this in turn started me thinking about the economic incentives in more detail. I wrote this: https://gist.github.com/AdamISZ/afc1ed3542d0aefaace718c029dc4a81

If the thinking there is right, there might be an argument that: considering only economic incentives, lying about the podle by using an invalid pubkey can only lead to creating failed transactions, not economic advantage, even if the Taker just takes the pubkey in the sig as given (i.e. doesn't check via utxoset). That would involve only checking the podle after !sig, not after !ioauth.

chris-belcher commented 7 years ago

For economic incentives its easier because a coinjoin just won't be accepted by the network unless everything is valid.

I'm more concerned about DOSers who want to deny service with a wider reach. For example, one way to detect an invalid UTXO is to notice that pushtx() always fails. Then the taker could ban all the makers, but that allows one DOS maker to get 5-6 other makers banned by that taker.

AdamISZ commented 7 years ago

Yes, seems fairly clear: you either check the validity of a commitment upfront, or you allow the blockchain to check it for you. In the latter case, you can't detect the cheater yourself, only that there has been cheating.

Clearly the superior model is to actually check it oneself, to detect the origin of the bad behaviour, so it's a question of whether we make that a requirement of all participants. If not, one can hope that the absence of an economic incentive is enough.

If absence of economic incentive isn't enough, we're solving a much harder problem. There are other DOS vectors than this.

I would just say, forget it, Taker needs to be able to check utxo commitments, but I'm keenly aware that the possibility of an SPV mode is very important, and if this breaks it, that's not something to be brushed aside, not at all.

Meanwhile, I'm leaning towards believing this is a valid approach, but I want the message channel to support the kind of bursty bandwidth we get in the orderbook return. Feels like we stretched IRC to its limit.

chris-belcher commented 7 years ago

That's a good point, that is someone wasn't economically motivated they could DOS joinmarket today.

Regarding IRC, I just came up with #713 which might be good at smoothing out the traffic on !orderbook. Trying to think of any other places where IRC would be stretched the most, maybe the !sig and !tx replies are the biggest after !orderbook

chris-belcher commented 7 years ago

In the maker wallet utxo podle commitment idea, the taker should send the nonce it used to the makers when filling an order, to avoid the makers having to keep a record of every nonce.

AdamISZ commented 7 years ago

@chris-belcher I didn't understand that, I thought the idea was to add the nonce to the orderbook request so the makers can send their podles with their list of offers? (!orderbook <nonce>)

chris-belcher commented 7 years ago

@AdamISZ Yes, the taker uses !orderbook <nonce>, then afterwards when the taker sends !fill <arguments> it will also send the nonce it used. The maker will need that information when opening the podle commitment. It avoids having the maker keeping track of every nonce.

AdamISZ commented 7 years ago

@chris-belcher got it, thanks.

chris-belcher commented 7 years ago

I had an idea to massively cut down on the amount of data transmitted in the "podle for committing to maker's UTXOs" idea.

Some background first: Bloom filters are a probabilistic data structure. They can be used to test whether an element is a member of set. If the test returns false then the element is for sure not in the set. If the test returns true the element is either a member of the set or is a false positive. So set-membership can return false positives but not false negatives. The false positive probability can be controlled.

A bloom filter is a bit array. To add an element to the set, hash it and use that to set certain bits in the bit array to be 1. To test whether an element is in the set, hash it and check whether the bloom filter bit array has 1s in all the correct positions. Also, set intersections can be tested for by ANDing the two bloom filters.

To apply this to the maker podle UTXO commitments, the maker would calculate all the UTXO podles and add them to a bloom filter. It would send that bloom filter to the taker who requested !orderbook. The taker would check for any set intersections of the bloom filters. If there are any intersections then that is a strong indication that those makers are using the same UTXO multiple times and the taker will discard them. When the taker chooses makers and sends !fill to them, the makers must send their podle hash along with the other proof for opening the commitment. The false positive rate of the bloom filters will be chosen so that a false positive intersection happens very rarely.

Since the bloom filter is a fixed small number of bytes, this scheme reduces the data transmitted and makes it scale as O(1) with respect to the maker's UTXO count.

Attacks / Mitigations

A malicious maker could send a bloom filter set to all 1s which would match with everything. To solve this, the taker must be coded to first reject bloom filters that match many other bloom filters. In other words it must retain as many bloom filters as possible with the requirement that they don't intersect. I.E. If there are filters A, B, C, D and E. And E matches A and B but A,B,C,D dont match each other, then reject E, not A or B. It should be possible to find E in this case with a scalable algorithm by using a divide-and-conquer binary search style algorithm. For example batch verifying all the bloom filters by ANDing them together, and if there is a match then split them into two groups and AND all those, etc continue until you find the intersecting filters.

Another attack is that a DOSer could connect with many many makers and send out lots of random bloom filters hoping to collide with genuine makers. To help against this, when faced with two bloom filters that intersect each other and don't intersect with any other filter, the taker should just randomly choose one to reject. Also it's worth noting that a very similar DOS to this is possible today by lots of fake makers just announcing offers and never filling them.

Finally, a possible attack would be faking bloom filters after the fact. I.E. is it possible to come up with another podle commitment that still matches the advertised bloom filter. If a cryptographically-secure hash function is used in the bloom filter algorithm then I'm pretty sure the answer is no, because finding such a collision would involve partially bruteforcing sha256 or sha512.

chris-belcher commented 7 years ago

\<belcher> gmaxwell any idea what data structure is referred to here are being 50% more compact than bloom filters? https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-January/013428.html i asked adam back (the author of the email) and he suggested to ask you \<gmaxwell> belcher: an entropy coded bitmap is almost twice as efficient as a bloom filter. They're just costly to update. \<gmaxwell> e.g. you build a filter one a single hash function, so it's very big and sparse, and then entropy code it so that it only takes the minimum amount of space. \<gmaxwell> cuckoo filters also can achieve capacity, and from an engineering perspective may be easier to deal with than a compressed bitset.

AdamISZ commented 7 years ago

It would be great if someone else started looking at this apart from us 2 and chimed in. My feeling is that the bloom filter approach is adding complexity it'd be better to avoid.

I wrote this some short while ago:

Mar 01 22:31:44 back of envelope, for background: est. 30 utxo/wallet, est 25 bytes of truncated 16 byte commitment with encoding, est 150 counterparties: 112kB !orderbook response. Mar 01 22:31:52 many webpages deliver 10x that :)

It just seems to me the right solution is a message channel implementation that is more forgiving than a public IRC (which is explicitly designed for human chat) is the right thing to do.

Adding compactification, and then having to deal with attacks, seems a bit against the spirit of the idea (which was to remove any doubts about getting fakes).

None of these comments mean I'm sure one way or the other, it's just my feeling, currently.

eduard6 commented 7 years ago

Questions before I write a longer reply:

The communication order is:

T: !orderbook <nonce>
M: !ordertype ... !ordertype ... !mhp2 MPC1 !mhp2 MPC2
T: !fill order-id amount tencpubkey TPC1
M: !pubkey ...
T: !auth TPP
M: !ioauth ... MPP
T: !tx ...
M: !sig ...

(M|T)PCn: Maker/Taker PoDLE commitment (M|T)PP: Maker/Taker PoDLE parameters (P, P2, s, e) !mhp2: Used instead of !hp2 for clarity

After Taker gets MPP, Taker can decode MPCn, revealing the UTXO (TXID)?

If that is correct and Taker can decode all of the MPCn, then the first problem I see is Taker gets access to all of Maker's UTXOs but only gives up one of its own.

And !mhp2 is not like !hp2, because MPCn is only used by one Taker, for one transaction, right? It could be sent to Taker in private?

chris-belcher commented 7 years ago

The taker gets all the commitments but not all the parameters i.e. not every commitment is opened, so the taker only learns the UTXOs that will be used in the coinjoin.

Yes, what you're calling !mph2 would be specific to that one !orderbook request because it involves the nonce, and therefore that information could be PM'd to the taker.

eduard6 commented 7 years ago

More comments.

Makers can still get access to Taker's UTXO without using up a UTXO of their own (by either using fake MPCn or not responding to !auth).
What punishment is there for Makers in using a fake MPCn or not responding to !auth? It is true they will not be able to Sybil as easy by being multiple participants in the transaction but they still get access to Takers' UTXOs. If a Taker does not get valid MPP from Maker it will blacklist that Maker but that really does not help much because it will be only a few Makers per JMTX attempt. If there is a Sybil attacker they might launch thousands of Makers so blacklisting by one Taker will not have much effect.
A Sybil attacker could do a hybrid attack. They offer some valid MPCn and some invalid MPCn but decide whether to provide valid MPP based on the !fill amount and/or how many of their Sybil bots were contacted by Taker (Takers contact Makers in parallel, right?) The attacker could even keep track of a Taker's multiple requests. They could deny requests until the Taker chose enough of their bots. This could be a reason for Takers to contact Makers in serial if they don't now do it that way.
I might be wrong but even in the current method if Taker gives valid TPC and TPP what prevents them from getting a UTXO from each Maker using that one TPC/TPP, by sending parallel !auth commands to many Makers? I think it is a race between Makers sending !hp2 and responding to !auth. Makers maybe should have a random delay before responding to !auth so they will have time to see !hp2 messages from other Makers.
Taker provides amount in !fill, so an honest Maker would provide MPP for just those UTXOs that are necessary to fill that amount?
I agree with @AdamISZ about the bloom filter. Neat idea but unnecessary when a proper data channel is used and adds complexity and uncertainty. And the data is not so big. The effort of switching to a proper data channel would probably be the same as adding the bloom filter and the result would be simplification instead of extra complexity.

Overall I think this idea:

Limits but doesn't prevent a Sybil attacker from gaming the system to becoming multiple participants in a JMTX
Does not stop Makers from spying on Taker UTXOs

chris-belcher commented 7 years ago

@eduard6 ~~Makers don't learn the taker's UTXO in the protocol. The taker sends a PoDLE commitment to a UTXO, not the UTXO itself.~~ EDIT: I forgot how taker's PoDLE's work.

What they could do is DOS, you're right about that. That would use up some of the taker's UTXOs for commitments. As you mention the only help against that is to not interact with that maker again. From what I see, the proposal in this thread of requiring the maker's PoDLE commitments doesn't make this DOS any worse. That DOS already exists in joinmarket today. (Of course it's still bad, but doesn't affect this proposal)

You are right about a malicious taker sending !fill to every single maker in an effort to obtain their UTXOs. I think we discussed this a bit in #156 and it is hoped that the 20% amount requirement and 5 confirmations would be enough to severely rate-limit this. This blog post also describes the taker's PoDLE commitment scheme as a race https://joinmarket.me/blog/blog/racing-against-snoopers-in-joinmarket-02/

Taker provides amount in !fill, so an honest Maker would provide MPP for just those UTXOs that are necessary to fill that amount?

Yes, only the UTXOs that the maker intends to coinjoin need to have their commitments opened as proof.

chris-belcher commented 7 years ago

Regarding the bloom filter.

I also don't have strong feelings either way but here is some more analysis.

Of course I agree that another messaging channel is a great idea, but the two ideas don't exclude each other. For the new messaging channel it has to be said that we don't know of an idea that works yet.

The main advantage of the bloom filter idea is it converts O(N) scaling to O(1) scaling. Although that back-of-the-envelope calculation finds a reasonable value for bandwidth, its still subject to O(N) scaling in UTXO-count. Your calculation assumes 30 UTXOs but here are two situations where that number could be much higher:

UTXO-aggregators: In future JoinMarket could be used by people who want to pay lower miner fees, especially if/when schnorr signature aggregation is added to bitcoin. These people would own a lot of UTXOs that they want to combine together as cheaply as possible. So their UTXO-count would be much much larger than 30. I don't have the TXIDs now but recently I've seen transactions with several hundred inputs going to one output. These use-cases are good for bitcoin and joinmarket because they increase bitcoin's fungibility by giving a wider range of people more reason to create coinjoins, and also increase the scale of bitcoin by making the scarce block space be used more efficiently.
Adversaries: An adversary of joinmarket could set N as high as they like, to force takers and the messaging channel to download very large amounts of data. We could set a limit on this bandwidth but that would cut out the UTXO-aggregator use case above.

Adding compactification, and then having to deal with attacks...

I think focusing on the dealing with attacks is unfair against the bloom filter idea, most ideas have possible attacks, spending a lot of text on mitigating attacks doesn't mean the idea is vulnerable or complicated.

If we go over the attacks I wrote about again: the third attack is not a problem and nothing needs to be done against it, the second attack requires a slight change (when faced with two intersecting filters, choose one randomly to discard instead of discarding the first-seen or second-seen) and the third attack requires someone to keep an idea in mind when writing the comparison code that would have been written anyway.

...seems a bit against the spirit of the idea (which was to remove any doubts about getting fakes).

There won't be doubts. Bloom filters have false positives but not false negatives. So if two filters result in false when checked for set-intersection, then we can be 100% sure that the wallets do not have any UTXOs in common.

False positives means that sometimes one maker will be (unfairly) discarded in favour of another just by chance, this can happen with probability 1% or 0.1% or whichever we choose. It will depend on the taker's nonce value so will change every time and will hurt every maker equally.

This is my feeling for the best way to go from here

Discuss more
Learn about how other projects like bitsquare handle p2p, maybe they've thought of something useful to us.
Code the maker's wallet PoDLE commitment idea without bloom filters for IRC. #713 can be used to make the IRC traffic less bursty. Set a limit on number of UTXOs to deal with the O(N) issue. Make it extensible so it can also be modified as easily as possible for another messaging channel and bloom filters if/when they get made.

AdamISZ commented 7 years ago

@eduard6 some brief response to your main Qs:

What punishment is there for Makers in using a fake MPCn or not responding to !auth? It is true they will not be able to Sybil as easy by being multiple participants in the transaction but they still get access to Takers' UTXOs. If a Taker does not get valid MPP from Maker it will blacklist that Maker but that really does not help much because it will be only a few Makers per JMTX attempt. If there is a Sybil attacker they might launch thousands of Makers so blacklisting by one Taker will not have much effect.

Indeed, the design as exists forces the taker to reveal one utxo, although it doesn't have to be connected to the transaction. I struggle to see an alternative, unless perhaps based on fair-exchange protocols (I get secret X iff you get secret Y) (I have a vague sense it's impossible without a TTP; in tumblebit, darkleaks and the like, the TTP is the blockchain). There are a lot of subtleties; in a complete JM tx run, the input utxos are all exposed due to the effectiveness of joinmarket-sudoku, the only claim is that fungibility is gained in the coinjoin outputs. If a maker aborts early, knowing the utxo, it's debatable what that gains; knowledge that a user wants to gain fungibility on utxo U, which is presumably not currently deemed sufficiently fungible; and that knowledge will in any case be gained in public if U is used in a future JM tx. But I agree, there is a privacy leak here of some sort in some scenarios. I don't (yet) see a better way.

A Sybil attacker could do a hybrid attack. They offer some valid MPCn and some invalid MPCn but decide whether to provide valid MPP based on the !fill amount and/or how many of their Sybil bots were contacted by Taker (Takers contact Makers in parallel, right?) The attacker could even keep track of a Taker's multiple requests. They could deny requests until the Taker chose enough of their bots. This could be a reason for Takers to contact Makers in serial if they don't now do it that way.

Yes, they can do this. The question is, what kind of attack. From context it seems on this point you're still talking about the previous attack you mentioned, learning taker utxos. If you were talking about economic attack I tried to address that in the previously mentioned gist. So assuming the former, my previous response applies.

I might be wrong but even in the current method if Taker gives valid TPC and TPP what prevents them from getting a UTXO from each Maker using that one TPC/TPP, by sending parallel !auth commands to many Makers? I think it is a race between Makers sending !hp2 and responding to !auth. Makers maybe should have a random delay before responding to !auth so they will have time to see !hp2 messages from other Makers.

Indeed, this is also inherent in the current design (takers can get a few utxos from every Maker by filling all of them); in fact, @adlai was working up a taker-sophistication algorithm to use that to advantage (query a larger-than-needed set; try to do a tx, keep some in back pocket). And the design was specifically set up to allow it (so not the delay approach you mention). Remember, in the pre-podle design, requests were free and attackers used that to (attempt to) gain a very full picture of the maker utxo offer set, as it evolved over time. Podle rate-limiting can only increase the cost of that. I was originally worrying that it would be too cheap, but the attacks entirely stopped taker-side after it (my sneaking suspicion is because attackers don't want to reveal even one utxo).

Taker provides amount in !fill, so an honest Maker would provide MPP for just those UTXOs that are necessary to fill that amount?

Yes.

Limits but doesn't prevent a Sybil attacker from gaming the system to becoming multiple participants in a JMTX

Yes, they can become multiple participants but only in the sense of using different subsets of their utxo set as "different participants", which reduces to multiple isolated wallets effectively. The basic proposal completely removes the possibility of offering the same utxo from 2 different maker bots, and thus gaining a probabilistic economic advantage.

Does not stop Makers from spying on Taker UTXOs

To summarize the first response, it's questionable how much of an attack it is (fungibility of proposed U), and I don't see an alternative to impose cost on requests (which is definitely necessary). I for sure agree it is not perfect!

AdamISZ commented 7 years ago

@chris-belcher re:

Of course I agree that another messaging channel is a great idea, but the two ideas don't exclude each other. For the new messaging channel it has to be said that we don't know of an idea that works yet.

I guess that is something I was just tacitly assuming; via #650 (or whatever), I couldn't see why it should be such a big deal, especially if we totally ignore an attempt to P2P it. It's not hard to either (a) set up an IRC server for the specific function, with throttling turned way down but other simple anti-DOS intact, or even just write a small tcp server. The functionality required is about as simple as it gets. The only thing that gives me pause is it's a little more difficult given that it must be Tor enabled. But even if I wouldn't be able to knock that together in 1 day, there are other people who surely could.

eduard6 commented 7 years ago

A Sybil attacker could do a hybrid attack. They offer some valid MPCn and some invalid MPCn but decide whether to provide valid MPP based on the !fill amount and/or how many of their Sybil bots were contacted by Taker (Takers contact Makers in parallel, right?) The attacker could even keep track of a Taker's multiple requests. They could deny requests until the Taker chose enough of their bots. This could be a reason for Takers to contact Makers in serial if they don't now do it that way.

Yes, they can do this. The question is, what kind of attack. From context it seems on this point you're still talking about the previous attack you mentioned, learning taker utxos. If you were talking about economic attack I tried to address that in the previously mentioned gist. So assuming the former, my previous response applies.

The attack I was thinking of was being most or all of the Makers in a transaction. Assuming Sybil attacker has thousands of bots. Steps:

Taker selects orders from 3 non-attacker bots and 3 attacker bots
Taker sends !fill to each bot in parallel
Attacker gets 3 !fill messages from Taker and waits 10 seconds before making a decision to respond or not. Decides 3 is probably not most of the Makers and doesn't respond
Taker attempt times out
Taker selects new orders. This time from 1 non-attacker bot and 5 attacker bots
Taker sends !fill to each bot in parallel
Attacker gets 5 !fill messages from Taker and waits 10 seconds before making a decision to respond or not. Decides 5 is probably most of the Makers and responds
Taker finishes transaction
Attacker can now eliminate most of the outputs when analyzing the TX

Now that I think about it the attacker could do this with !sig anyway. They always respond to !fill messages and then examine the !tx and see how many of their bots are included and then selectively send or not send !sig until they get a !tx with many of their bots included.

Sending the !fill messages in serial would mean that the attacker would have to use some of their UTXOs which would be an improvement but wouldn't totally prevent the selective !sig method.

eduard6 commented 7 years ago

Edit: This comment moved to new issue #724

AdamISZ commented 7 years ago

@eduard6 (sorry for delayed response)

The attack I was thinking of was being most or all of the Makers in a transaction

I realise this doesn't directly answer your question, but I think a tangent might be worthwhile at the start:

This is the most fundamental critique of Joinmarket, and it applies outside of this issue specifically. In fact, some people pretty much immediately dismiss Joinmarket (at least in current design) because of it. @chris-belcher , I and others who were here from the start did not consider it a valid reason to reject the design (with anonymous participants making a market), because of a certain perspective: coinjoin is an opportunistic attempt to improve fungibility, which can't make it worse; attacking through Sybil only fully works when being all other participants; more than one actor trying to Sybil degrades the value of the attack to each attacker; random choice among participants means only a small number of honest counterparties stops the attack from working (if not 100%). You get the general idea.

(Another thing that often crops up is "doesn't Coinshuffle fix this"? The answer is no, it doesn't. Coinshuffle prevents one party from gaining full knowledge of mappings, what it doesn't (and can't) do is prevent Sybil attacks where all counterparties are the attacker; in that case cryptography cannot help, it is just a process of elimination to find the "victim")

Another aspect of the general Sybil issue is counterparty selection. The more deterministic the selection process is, the more an attacker can try to game the system to swamp honest participants. The basic approach to defending that is IMO just randomization; the current default weighted_order_choose is random, but heavily weighted by price. The paper that was written on Joinmarket argued that swamping the orderbook with cheap prices gives the attacker an opportunity to exploit this, but I don't consider it a big deal, not only because honest counterparties can also be cheap, but more importantly because we could have Takers be much more price insensitive: e.g., "random under a maximum". Although I haven't got round to doing it, it is very easy to code (and doesn't require consensus) to do this.

Attacker gets 3 !fill messages from Taker and waits 10 seconds before making a decision to respond or not. Decides 3 is probably not most of the Makers and doesn't respond

Taker attempt times out

In this case completion-with-subset allows the Taker to complete anyway, he doesn't need to timeout (and with default settings, won't). I think complete-with-subset is pretty important, btw Coinshuffle has to go through a bunch of steps specifically to allow this (put another way, to prevent this DOS attack); for our "taker" concept, it's easy (ignoring some details).

Now that I think about it the attacker could do this with !sig anyway.

This is indeed viable, but: with the defence described in this thread, the Taker will reject repeated utxos in the reveal in !ioauth, if there are repeats. Of course this doesn't prevent the general "swamp the orderbook" issue, but that's discussed at length above, not really to do with the defence against "copy" bots being discussed in this thread.

chris-belcher commented 6 years ago

This maybe seems like maybe an obvious insight, but why do the commitments have to use podle at all? Could makers just publish sha256(utxo + '|' + takers_nonce) instead? Podle is required if you need to transfer the proof to someone else.

BTW I wrote this summary a few months ago here: https://www.reddit.com/r/joinmarket/comments/5wyfju/wallet_duplication_by_makers/ Everyone probably has already seen it but it's good to cross-link reddit/github both ways.