TokTok / c-toxcore

The future of online communications.
https://tox.chat
GNU General Public License v3.0
2.27k stars 285 forks source link

Tox should be resilient to Sybil attack. #517

Open yurivict opened 7 years ago

yurivict commented 7 years ago

I am thinking of the scenario when an attacker would register an overwhelming number of fake hosts, that will send wrong replies to the legitimate hosts so that they will be confused and will fail to find the right nodes.

Is there anything in the the Tox system that prevents such attacks?

nurupo commented 7 years ago

What you are describing is the Sybil attack. There is nothing preventing it in Tox DHT.

yurivict commented 7 years ago

Ok, thanks!

SkyzohKey commented 7 years ago

Reopening that issue as it's still an issue and it needs to be fixed.

nurupo commented 7 years ago

It would be nice to have a solution to this, but Sybil attack is inherent to the most decentralized p2p systems and is not something that is easy to fix. Some proposed solutions to it require having a centralized trusted party that verifies identities of every single node on DHT, which compromizes anonymity of the DHT and makes the centralized trusted party a target for an attack. BitTorrent's DHT is vulnerable to it. Bitcoin seems to restrict the number of peers it makes an out band connection with by IP range, e.g. only one node in /16 subnet, but I'm not sure if this would work for a DHT as you would have Id-space gaps in the routing table, and how would that work with IPv6.

nazar-pc commented 6 years ago

I've recently read the paper Brahms: Byzantine Resilient Random Membership Sampling, you might find it useful too: https://gnunet.org/sites/default/files/Brahms-Comnet-Mar09.pdf As far as I understand it, Brahms is a way to get uniform partial view of the network despite having malicious nodes in it. It claims to withstand against pretty strong adversary that actively poisons node's neighborhood with controlled malicious nodes.

SkyzohKey commented 6 years ago

@nurupo Read that paper, it's fantastic :) @nazar-pc Thanks for the link :)

nbraud commented 6 years ago

@nazar-pc @SkyzohKey Like many BFT protocols, Brahms seems to require f ≤ 1/3 (i.e. the pseudospoofing attack didn't involve more than 1/3 of the total number of nodes):

In contrast, if the fraction of faulty pushes exceeds 1/3, the only fixed point [of the system-wide fraction of faulty ids in local views] is 1.

While most of the paper only assumes 0 < f < 1, the analysis in section 7 is done under the assumption that p = f, and they conclude that if p > 1/3, then the only fixed point is 1 (i.e. they expect 100% of faulty ids in local views).

f < 1/3 is a pretty dangerous assumption in an open system, since it's very easy to bring in many new nodes and break the assumptions (and then break the security properties that the protocol wa supposed to provide).

Caveat Lector: this was a pretty quick read through the paper, and I would definitely appreciate someone having a second look.

TNTBOMBOM commented 4 years ago

I didnt understand what type of DHT Tox is using, is it Kadmelia?

Anyway the solution is within R5N DHT (i think), i suggest to read these papers:

Practical Attacks Against The I2P Network

To read more about it:

Randomized Recursive Routing for Restricted-Route Networks

comparison mentioned on paper talking about GNUnet and Kadmelia:

The GNUnet DHT

zugz commented 4 years ago

I didnt understand what type of DHT Tox is using, is it Kadmelia?

Anyway the solution is within R5N DHT (i think) Randomized Recursive Routing for Restricted-Route Networks

Thanks, that's an interesting paper.

Tox's DHT isn't exactly standard Kademlia, either in operation or in purpose. The primary function of the tox DHT isn't actually to store and retrieve data, but to find a DHT peer's external IP+port so we can directly connect to it if possible, and otherwise to find peers we can connect to who have a connection to the peer, and so can co-ordinate attempts to punch through firewalls by port-guessing. In other words, the DHT acts as a kind of distributed STUN-server (but doing more than in STUN).

To help with this, the eviction strategy isn't Kademlia's; instead we favour the nodes within a given bucket which are XOR-closest to us. This leads to pairs of peers each trying to add the other to a bucket, which is enough to get through certain pairs of NAT types. This should mean that the tox DHT tends to be rather less "restricted" than in the setting of that paper -- though I haven't seen any actual measurements checking this.

We do also have plans to use the DHT to store information in the traditional way. There's a proposal for this at https://github.com/zugz/tox-DHTAnnouncements/blob/master/DHTAnnouncements.md#user-content-announcing-and-searching which uses a mix of the "iterative" and "recursive" techniques -- keeping control of the search with the searcher, but handling inaccessible nodes by setting up routes to forward search requests. Comments on this would be very welcome. Based on a quick examination of R5N, the key advantages of this scheme over R5N are that there's no need to estimate the network size, the packets are smaller (no need for a bloom filter), and less state and traffic is required for reterning responses to requests since we shortcut the routes with direct connections where possible. But for a network which is really highly restricted (with a large majority of nodes behind symmetric NAT, and few behind full cone or no NAT), a recursive procedure like R5N would work better.