Schmavery / raccoon

0 stars 0 forks source link

Handle security concerns #4

Open Schmavery opened 7 years ago

Schmavery commented 7 years ago

There are a number of security concerns that might arise over the course of this project. So far there seem to be at least a couple of known attacks on a DHT. One is a Sybil attack, which I have yet to fully understand, but involves attackers creating multiple fake identities to "outvote" regular users in various consensus-based activities (spam reports, reputation systems, consensus algorithms).

The other is an ID mapping attack, whereby an attacker picks their own ID in the DHT. This allows the attacker to gain control over certain targeted resources that may be near their node. In order to combat this, I believe it's necessary to force the attacker to pick an ID that is out of their control. This id needs to be externally verifiable by other peers so that the attacker can't just lie about their ID. Other peers need to be able to identify another node which is using a false ID and disconnect from it.

Initially I had imagined using a hash of the node's IP as an ID. This is a solution proposed by a number of papers I've read. In this scenario, other nodes can see the attacker's IP address and verify the attacker's ID matches. However, I'm not sure that this is a viable solution when using webrtc, as webrtc communicates using ICE candidates. The following idea was to consider using a hash of the peer's public key as an ID. This was rapidly shot down by the ID Mapping Attack paper below.

Reading material:

  1. ID Mapping Attacks in P2P Networks http://www.davidecerri.it/wp-content/uploads/2016/02/art-id_mapping_attacks-globecom05.pdf

    This explains an attack on a DHT with an unconstrained ID selection mechanism (such as Kademlia, which this paper is based on). It considers the option of using the hash of a public key as an ID as an attempt to defend against ID mapping attacks. It provides fairly convincing evidence that this isn't secure (read the first paper below for more info, it's pretty straightforward). The proposed solution was to constrain the ID by having it be a hash of the IP and port number of the connection.

  2. S/Kademlia: A Practicable Approach Towards Secure Key-Based Routing http://www.tm.uka.de/doc/SKademlia_2007.pdf

    This proposes a solution similar (slightly more complex) to my solution below. The paper admits that this isn't actually a good solution (quoting "Secure routing for structured peer-to-peer overlay networks"[4]), but claims that it is the best that we have.

  3. SybilLimit: A Near-Optimal Social Network Defense Against Sybil Attacks http://www.pittsburgh.intel-research.net/projects/sybil-defenses/p885-yu.pdf

  4. Secure routing for structured peer-to-peer overlay networks https://pdfs.semanticscholar.org/1cfa/21fd43a8154bbb0e01acf7d52d520749444d.pdf

  5. A DDoS Attack by Flooding Normal Control Messages in Kad P2P Networks http://www.icact.org/upload/2012/0156/20120156_finalpaper.pdf

  6. Misusing Kademlia Protocol to Perform DDoS Attacks https://www.researchgate.net/publication/224362886_Misusing_Kademlia_Protocol_to_Perform_DDoS_Attacks

Schmavery commented 7 years ago

The takeaway from this point is that we could technically use a IP-based ID-restriction scheme for a subset of peers

When using a ID verified by proof of work, you always send along your public key and nonce along with your requests as proof of your ID. Then anyone wanting to verify your ID can redo your hash calculations to verify that your nonce is valid, then send you a request encrypted with your public key which you return decrypted to prove you actually own that key.

IP Spoofing:

Outstanding concerns:

We want key generation to be fast enough that during account creation it can be done quickly enough (1 min max ideally), but then it will still take only on the order of n minutes (n = # active nodes) to find a good attacking nonce. Doesn't seem great. Potential workaround would be to perform ID rotation periodically as suggested by paper [1] , though this is a large inefficiency and it's unclear whether we could even rotate fast enough to work (especially since this brute-forcing can be easily parallelized).

Schmavery commented 6 years ago

An idea we had a couple months ago was to only allow nodes that are t+ days old store data, and then rotate the positions of data every t days (or months). If the rotation happened predictably (based on date, for instance), this wouldn't help, but if a random number was somehow chosen by consensus of some sort, then maybe this would make it harder to target a given node to hijack it. Of course, it's still not hard to make nodes so someone could just make more nodes for themselves than exist on the network and wait for t, at which point they can hijack any data they want with high likelihood.

Ideally, instead of time, we let you help with the replication of data but make sure that each piece of data is on at least one "trusted" node. As you store more and more data for the network, you become more "trusted"? Then it will cost the attacker money to operate "trusted" nodes. Something like this seems to be the most likely to work right and there is also research that has gone into this sort of thing: