TokTok / c-toxcore

The future of online communications.
https://tox.chat
GNU General Public License v3.0
2.24k stars 283 forks source link

Proposal for reworking the way onion paths are created #596

Open zugz opened 7 years ago

zugz commented 7 years ago

https://github.com/zugz/tox-onionPathsProposal/blob/master/onionPathsProposal.md

This expands substantially on #547.

zoff99 commented 7 years ago

@zugz the risk here is "only" that an attacker can possibly know what friends we are connected to. so our friendship net. but the message contents are safe.

is there also a DoS possibility? that the attacker can make our messages never arrive at the correct friend?

zugz commented 7 years ago

@zugz the risk here is "only" that an attacker can possibly know what friends we are connected to. so our friendship net.

If we're connected via TCP relays which the attacker doesn't control, then yes. Otherwise a deanonymisation attack is also possible - see the "current implementation" section in the proposal for details.

but the message contents are safe.

To my knowledge yes, in all cases.

is there also a DoS possibility? that the attacker can make our messages never arrive at the correct friend?

The only DoS attack that I know of is that an attacker can occupy our announce neighbourhood and pretend it hasn't seen us when others ask. This prevents us receiving friend requests. If a friend of ours is simultaneously attacked in same way, then we and the friend won't be able to find each other. This is a serious problem, and there are ways we could try to mitigate it, but it's a separate issue which this proposal doesn't touch on at all.

GrayHatter commented 6 years ago

@zugz the risk here is "only" that an attacker can possibly know what friends we are connected to. so our friendship net.

If we're connected via TCP relays which the attacker doesn't control, then yes. Otherwise a deanonymisation attack is also possible - see the "current implementation" section in the proposal for details.

If the attacker has access to the users ISP, then yes. A silent DoS would be trivial.

but the message contents are safe.

To my knowledge yes, in all cases.

This is correct, (the birthday attack still applies)

is there also a DoS possibility? that the attacker can make our messages never arrive at the correct friend?

The only DoS attack that I know of is that an attacker can occupy our announce neighbourhood and pretend it hasn't seen us when others ask. This prevents us receiving friend requests. If a friend of ours is simultaneously attacked in same way, then we and the friend won't be able to find each other. This is a serious problem, and there are ways we could try to mitigate it, but it's a separate issue which this proposal doesn't touch on at all.

You could use this attack to learn/confirm a friends list. There's also another attack that would allow you to force any DHT node it's friends list. Combining the two attacks, you could trivially, and silently, DoS selected connections at will/whim with very little resources.

nazar-pc commented 6 years ago

Couple of things about proposal:

Could you add some definitions section or something to make it easier to understand?

nazar-pc commented 6 years ago

Just to throw it somewhere, don't you think creating multiple DHT nodes can be substantially slowed down by involving some computationally heavy operation? Something like blocks mining in crypto currencies, but connected to keypair. It might make keypair generation waaay slower than 0.0007s.

zugz commented 6 years ago

Could you add some definitions section or something to make it easier to understand?

I just added an "overview" subsection to the "proposed system" section; hopefully it clarifies things?

zugz commented 6 years ago

The only DoS attack that I know of is that an attacker can occupy our announce neighbourhood and pretend it hasn't seen us when others ask.

You could use this attack to learn/confirm a friends list. There's also another attack that would allow you to force any DHT node it's friends list. Combining the two attacks, you could trivially, and silently, DoS selected connections at will/whim with very little resources.

Could you explain these attacks a bit more? Are you assuming the attacker can observe the target's network traffic for the first attack? What's the second attack?

zugz commented 6 years ago

Just to throw it somewhere, don't you think creating multiple DHT nodes can be substantially slowed down by involving some computationally heavy operation?

Proof-of-work would help against some of the attacks, yes. But there's always the problem of choosing the parameters to make attacks sufficiently difficult while not inconveniencing ordinary users too much, and having this remain the case as hardware improves. Anyway, it wouldn't help much against attackers occupying the announce neighbourhoods of particular targets, since we can assume the attacker is willing to spend the necessary resources. They could also precompute enough keys to quite densely cover key space, such that they have some quite close to any given point.

nazar-pc commented 6 years ago

I just added an "overview" subsection to the "proposed system" section; hopefully it clarifies things?

It does, thanks!

But there's always the problem of choosing the parameters to make attacks sufficiently difficult while not inconveniencing ordinary users too much, and having this remain the case as hardware improves.

Definitely, but I think this is an interesting area to explore, not only for anonymity and security, but also for robustness and scalability.

robinlinden commented 6 years ago

@irungentoo Can you have a look at this proposal?

zugz commented 6 years ago

I've just added a section on the problem with the current DHT eviction mechanism and a sketch of a proposed solution.

https://github.com/zugz/tox-onionPathsProposal/blob/master/onionPathsProposal.md

I now believe that the proposal deals with all feasible attacks aimed at deanonymising users or obtaining information about the friend graph. I haven't really tried to formalise what "feasible" means here, but it at least means that the cost of any remaining attack targetted at a particular victim should scale at least linearly with the size of the tox network for a fixed chance of success, assuming that the number of bootstrap nodes scales linearly with network size.

If anyone can see any remaining vulnerabilities, or ideas for simplifications/optimisations, please tell. Else, I'm planning to actually start work on implementation.

zoff99 commented 6 years ago

@zugz if the save format changes, will the be a migration function in toxcore? i would say we should start versioning pretty much everything. savefile -> with version toxAV -> with protocol version

so that toxcore can see what version is used, and also migrate on future changes. same goes for ToxAV protocol. that newer toxcore can handle different protocol versions.

robinlinden commented 6 years ago

@zoff99 Saves are already marked with what version they were saved at: https://github.com/TokTok/c-toxcore/blob/0fce3fc94cac81c8eb8eda52dfb281ccc992d5bf/toxcore/Messenger.c#L2665 https://toktok.ltd/spec#state-format

zoff99 commented 6 years ago

@robinlinden excellent. so we just need to migrate that when loading

nazar-pc commented 6 years ago

Well, everything boils down to building a DHT that is resistant to an active attacker. DHT node should be able to protect itself from neighborhood poisoning even at early stages of operation. I don't yet see a grand vision in mentioned proposal yet, moreover, there is no mathematical proofs how much better is it.

In one of the issues I've already mentioned Brahms: Byzantine Resilient Random Membership Sampling: https://gnunet.org/sites/default/files/Brahms-Comnet-Mar09.pdf If someone haven't take a look at it yet, I'd like to suggest you to do so. At least it contains analysis of the properties of proposed solution (which do not yet fully understand to be honest) and it seems to provide pretty good guarantees that can be used as foundation for improved version of DHT in contrast to patching existing solution.

zugz commented 6 years ago

Well, everything boils down to building a DHT that is resistant to an active attacker. DHT node should be able to protect itself from neighborhood poisoning even at early stages of operation.

Yes. It's true that poisoning the DHT is a way to attack the onion, and that what I'm proposing for the DHT doesn't actually make it immune to poisoning, just more expensive to poison.

You can still bombard a node with Nodes Requests from nodes you create using random DHT pubkeys, until eventually you hit upon the right keys and they add your nodes. A more subtle but less effective attack is to bias the closelists of peers by responding to Node Requests with information about only your own nodes. (These are the manifestations in Tox of the 'push' and 'pull' poisonings discussed in the Brahms paper.)

In one of the issues I've already mentioned Brahms: Byzantine Resilient Random Membership Sampling: https://gnunet.org/sites/default/files/Brahms-Comnet-Mar09.pdf

Thanks for pointing to this again. It does have some interesting ideas which may well be applicable. It looks like we could even implement much of it without breaking backwards compatibility (though not the "limited send" component).

I think for now we should leave making the DHT fully unpoisonable to future work. In this issue I want to concentrate on the onion modulo the DHT, i.e. making the onion do its job when we assume that the DHT is doing its job. As long as poisoning the DHT is expensive, this is worthwhile in itself.

zugz commented 6 years ago

I think for now we should leave making the DHT fully unpoisonable to future work.

Thinking again: if it's really possible to ensure uniform local views of the network, we can directly use that to pick path nodes for the onion. This would be much better than the solution I'm proposing here, which is based on the uncomfortable assumption that the bootstrap nodes are not colluding.

Looks like I should spend some time with the literature!

nazar-pc commented 6 years ago

Thinking again: if it's really possible to ensure uniform local views of the network, we can directly use that to pick path nodes for the onion.

That was exactly my point!

zugz commented 6 years ago

That was exactly my point!

Yep, sorry to be slow on the uptake!

zugz commented 6 years ago

Sybil attacks are the fundamental problem. If a Sybil attacker can poison our local view of the network, onion paths we build from it will be compromised. In particular, the "Brahms" system linked to above specifically assumes away Sybil attackers. Namely, it assumes that attackers have access to only so many fixed network identities (DHT keys in the tox case) - if the attacker can generate new ones at will, the guarantees in the paper don't apply.

Section 3 of http://www.distributed-systems.net/my-data/papers/2011.acm-cs.pdf gives a nice overview of defences against Sybil attacks on DHTs. If we assume we want to be able to resist attackers using a botnet, then there are only two types of known solution. One is to have a central authority certifying users, which I assume we agree is inappropriate. The other is to exploit features of human social networks. This may well be appropriate.

The most relevant systems of this kind are SybilGuard http://www.math.cmu.edu/~adf/research/SybilGuard.pdf , and its successor SybilLimit https://cs.nyu.edu/srg/docs/sybillimit-tr.pdf .

The basic idea is that a Sybil attacker will have only so many friend relations with honest nodes ("attack edges" in the friend graph); using this, and assuming that the honest part of the graph is connected and fast mixing, it's possible for an honest node to verify whether a given node is honest with a low false positive rate.

So my current thinking is that we should adapt SybilLimit, using the Tox friend graph, then use it to verify DHT nodes. This isn't straightforward - SybilLimit assumes a static network of always-on peers, with no privacy requirements - but it looks like it should be doable. The costs in network traffic would be considerable, but hopefully bearable.

This would give users resistance to Sybil attacks once they're part of a sufficiently large connected component of the friend network. Then a Brahms-like system can be used to get an unpoisoned view of the component, from which onion paths can be built.

Plenty of details need to be sorted. But I wonder what people think of this overall vision? It's complicated and might involve a compatibility break, but as far as I can see it's the only possibility for a true solution. And I'm not convinced that anything less than a true solution is worth having at all.

nazar-pc commented 6 years ago

I agree that nothing less than complete solution is worth the effort, as it will not fundamentally change situation.

Yes, Brahms assumes that nodes should not have multiple IDs, which is not something that is possible to guarantee in true distributed network.

I'll look at SybilGuard and SybilLimit now, thanks for the links!

nazar-pc commented 6 years ago

So I've looked at SybilGuard and then at SybilLimit and I have one major issue with it: both of them assume that attacker knows the whole friends graph, which is, I believe, one of the things Tox aims to protect from.

nazar-pc commented 6 years ago

I've read a few more researches and came to conclusion that it might be effective to require each node to store some significant amount of state that depends on node ID in order to function. Something like 100M is quite feasible on average machine (though seems expensive for such a simple app like messenger), but an attacker will need 1G of RAM for each 10 fake nodes. As RAM is a relatively scarce and expensive resource, I think it might work.

Have anyone seen studies containing such approach?

iphydf commented 6 years ago

100M is going to make mobile users really sad.

zoff99 commented 6 years ago

yes, very sad.

zoff99 commented 6 years ago

@nazar-pc how is RAM scarce? swapping on disk is easy.

sudden6 commented 6 years ago

@zoff99 not if you need true random access, see the stuff Etherum uses.

nazar-pc commented 6 years ago

Yes, RAM should be used in a way that will make swapping even to SSD inefficient. 100M was an example, which will work fine on modern mid-range phones that have 1-2G of RAM, while top models have up to 6G.

It can be 10M instead, depending on security properties and network size.

zugz commented 6 years ago

So I've looked at SybilGuard and then at SybilLimit and I have one major issue with it: both of them assume that attacker knows the whole friends graph, which is, I believe, one of the things Tox aims to protect from.

The protocols described in the paper leak a lot of information about the friend graph, yes. But I believe I see a way to adapt SybilGuard (though not SybilLimit) to deal with that, in a way which hopefully can be made not too expensive in network traffic. I have a half-written account of it; I'll finish it off and show it here soon.

zoff99 commented 6 years ago

@nazar-pc 100MB+ RAM for 1 app on a phone is really not possible now. most phones will kill other apps. remember this thing is a background app, until there is some sort of PUSH service for Tox.

zugz commented 6 years ago

but an attacker will need 1G of RAM for each 10 fake nodes.

Just limiting the number of concurrently active fake nodes isn't really enough. At least to prevent a targetted attack on Brahms, we also need to ensure an attacker can't cycle through lots of identities.

nazar-pc commented 6 years ago

Brahms just assumes there is a authority that issues identities, simply because of this fact I think it is out of scope of our needs. Looking forward for your SybilGuard modification.

zugz commented 6 years ago

Brahms just assumes there is a authority that issues identities,

Right, but as far as I can see, all it actually needs is that an attacker can't create too many identities which will be accepted by honest participants. So a SybilGuard-like system should do just as well in place of a central authority.

nazar-pc commented 6 years ago

@zugz, do you have something to share already?

zugz commented 6 years ago

@zugz, do you have something to share already?

Sadly not. The idea didn't work out. Sorry for not saying earlier.

There are lots of problems, but the fundamental one is this: any such scheme requires us to have a long-term identifier which we share with arbitrary DHT nodes, which they will try to use to determine whether or not we're sybil. But this in itself is a big problem: since friend relationships between nodes identified by IP address / DHT public key are not kept private in tox, a crawler can easily map out the friend network indexed by these long-term identifiers. That's a big privacy problem.

So I'm coming reluctantly to the conclusion that we should give up on making the DHT sybil-resistant. That does also mean giving up on the onion.

Grayhatter suggested using per-edge rendezvous spots, and I'm belatedly coming round to the idea that this might be the best (only?) solution. But that's a topic for another issue.

nazar-pc commented 6 years ago

I was recently messaged with presentation called "S/Kademlia: A Practicable Approach Towards Secure Key‐Based Routing": https://pdfs.semanticscholar.org/3165/2823ca71520038773346b6e5bbfadc5c8419.pdf

I found it very interesting and planning to use it in my project, which at the moment has Mainline DHT-like implementation (with some inherent hardening on top).