ipfs / notes

IPFS Collaborative Notebook for Research
MIT License

Extending IPNS to optionally resolve to plain HTTPS hosts through TLS certificate PK hashes and handshake verification #181

Open rotemdan opened 7 years ago

rotemdan commented 7 years ago

It seems like IPNS could, in theory, provide a rudimentary alternative for the DNS and CA infrastructure, and act as a complement to the plain TLS handshake and HTTP protocol stack (though with some minor modifications):

  1. User creates a self-signed TLS certificate and loads it to an HTTPS server (note: the certificate doesn't have to be associated with any particular 'name' or 'domain' of the conventional form).
  2. The certificate's public key is hashed and then used as an IPNS name.
  3. The resulting IPNS name resolves to an HTTPS host URL. E.g. https://172.217.21.238:443
  4. When the newly defined name association is first transmitted to an IPFS DHT node, the node tries to establish a TLS connection to the target address, hashes and verifies the public key within the certificate presented during the TLS handshake, and then closes the connection. This is only done once.
  5. The public key used by the host would also be hashed and verified by the client on every request. Beyond that check, the connection would proceed as a normal HTTPS connection and could use any HTTP protocol stack, including HTTP/2 (or theoretically any protocol that runs on top of TLS).
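
Steps 2, 4 and 5 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the name is a plain hex SHA-256 of the DER-encoded public key bytes (real IPNS names use a different encoding), the function names are hypothetical, and extracting the key from the handshake is left out.

```python
import hashlib
import hmac


def name_from_public_key(public_key_der: bytes) -> str:
    """Derive a name by hashing the public key bytes (assumed encoding: hex SHA-256)."""
    return hashlib.sha256(public_key_der).hexdigest()


def presented_key_matches(expected_name: str, presented_key_der: bytes) -> bool:
    """Check that the key seen during the TLS handshake hashes to the expected name."""
    actual = name_from_public_key(presented_key_der)
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(actual, expected_name)


# Hypothetical key bytes, for illustration only.
server_key = b"example-public-key-bytes"
name = name_from_public_key(server_key)
```

A DHT node would run the check once when the entry is first submitted, and a client would run the same check on every request.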

My question is: has this ever been considered as a possible feature of IPFS/IPNS, in some form? Are there any special security implications that need to be noted?

I'm aware of the issue of multiple valid hosts sharing the same public key (though it seems possible that an elegant solution for that could be found). More generally, what are your current views on this direction of providing a gateway to the more conventional 'mutable' web through the distributed name system (technical issues aside)?

Edit:

Some points of clarification:

  1. There are two different layers of verification for a hash/host combination. The first layer is that the given host address (ip/port combination) is signed by the submitter. The second layer is that the host itself is contacted and a TLS handshake is performed with it, where the PK in the presented certificate should match the public key given by the submitter of the IPNS entry (there might be some slight technicalities in securing the signatures for the host address, but I'm leaving that for the moment).
  2. The certificate doesn't have to be self-signed, and it can actually point to any 'normal' DNS domain. The IPNS system simply doesn't care about what it points to, as long as its public key hashes to the expected value and the TLS handshake to the host succeeds. This means that any 'normal' website can also be submitted to IPNS by its owner.
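
A minimal sketch of the two verification layers, assuming a hypothetical `HostEntry` record and two injected helpers: `verify_signature` (any signature scheme) and `fetch_presented_key` (whatever performs the TLS handshake and returns the key from the presented certificate). None of these names come from IPFS itself.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HostEntry:
    name: str          # hash of the submitter's public key (hex SHA-256 assumed)
    host: str          # e.g. "172.217.21.238:443"
    public_key: bytes  # submitter's public key bytes
    signature: bytes   # submitter's signature over the host address


def validate_entry(
    entry: HostEntry,
    verify_signature: Callable[[bytes, bytes, bytes], bool],
    fetch_presented_key: Callable[[str], bytes],
) -> bool:
    # The submitted key must hash to the name being claimed.
    if hashlib.sha256(entry.public_key).hexdigest() != entry.name:
        return False
    # Layer 1: the host address is signed by the submitter.
    if not verify_signature(entry.public_key, entry.host.encode(), entry.signature):
        return False
    # Layer 2: the host presents the same public key in its TLS handshake.
    return fetch_presented_key(entry.host) == entry.public_key
```

Injecting the two helpers keeps the sketch independent of any particular signature scheme or TLS library.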

Edit2: some corrections

rotemdan commented 7 years ago

Some preliminary notes:

(1) In terms of the protocol, having an IPNS locator like /ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/some/path/randomcat.jpg resolve to an HTTPS URL like https://172.217.21.238:443/some/path/randomcat.jpg would not seem to violate most high-level client assumptions about the locator.
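
The mapping from locator to URL in this example could look something like the sketch below. The function name `ipns_to_https` is hypothetical, and the resolved host is assumed to be already known.

```python
def ipns_to_https(locator: str, host: str) -> str:
    """Rewrite /ipns/<hash>/<path> to https://<host>/<path>."""
    parts = locator.split("/", 3)  # ["", "ipns", "<hash>", "<path>"]
    if len(parts) < 3 or parts[0] != "" or parts[1] != "ipns" or not parts[2]:
        raise ValueError(f"not an IPNS locator: {locator!r}")
    path = parts[3] if len(parts) > 3 else ""
    return f"https://{host}/{path}"


print(ipns_to_https(
    "/ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/some/path/randomcat.jpg",
    "172.217.21.238:443",
))
# prints https://172.217.21.238:443/some/path/randomcat.jpg
```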

(2) Based on my superficial analysis so far, it seems that using IPNS to point to a concrete IP host this way would be at least as secure as when IPNS points to an immutable IPFS resource. The only difference is that the 'content-defined' IPFS locator is replaced by a 'certificate-defined' (or 'public key-defined') concrete host locator.

(3) I've read several discussions about how to serve dynamically generated content over IPFS/IPNS. The suggested methods seemed a bit on the complex side. This proposal could provide a solution that greatly simplifies it while avoiding falling back to the more 'traditional' DNS system.

cc @jbenet @whyrusleeping - would love to hear your comments on this.

rotemdan commented 7 years ago

Trying to tackle some technical and other misc issues:

What about when the certificate/PK is changed by the author (I mean - not necessarily due to a compromised key but say, for periodic renewal or for extra security reasons)? Will the locator be broken forever? Solution: A new IPNS entry would be created and the old one could be modified to point to it, instead of the concrete host locator. Basically, instead of pointing to an HTTPS host, the old PK hash would now point to the new PK hash by resolving to a secondary IPNS locator instead of a host. If the new hash is verified and the author has proved ownership of both public keys by providing the correct signatures, then the entry is considered valid (this redirection can happen several times, though only one hop is actually needed: if the owner kept all previous key pairs, they can update all of them to point to the most recent one).
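
A rough sketch of how a resolver might follow such redirections, assuming a hypothetical `records` mapping in which each name resolves either to an `https://` host or to another `/ipns/` locator. Signature checks are assumed to have happened before an entry is stored; the hop limit and loop detection are additions for safety, not part of the proposal.

```python
from typing import Dict


def resolve(name: str, records: Dict[str, str], max_hops: int = 8) -> str:
    """Follow /ipns/ redirections until an https:// host is reached."""
    seen = set()
    current = name
    for _ in range(max_hops + 1):
        if current in seen:
            raise ValueError(f"redirection loop at {current!r}")
        seen.add(current)
        target = records[current]  # KeyError if the name is unknown
        if target.startswith("https://"):
            return target
        if not target.startswith("/ipns/"):
            raise ValueError(f"unsupported target: {target!r}")
        current = target[len("/ipns/"):]
    raise ValueError("too many redirections")


# Hypothetical names: the old key hash now points to the new key hash.
records = {
    "old-key-hash": "/ipns/new-key-hash",
    "new-key-hash": "https://172.217.21.238:443",
}
print(resolve("old-key-hash", records))  # prints https://172.217.21.238:443
```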

What about multiple IPs per PK? failover? round robin? random host selection? location based resolution? This needs to be investigated further.

Why limit this just to the proven owner? Why not allow everyone to submit these types of IPNS entries as long as they are verifiable through the handshake? There is no fundamental reason why not, actually. Technically, a bot could scan entire DNS registries, contact the hosts, read each certificate's PK and automatically submit it to IPNS, eventually indexing significant parts of the public web and updating the entries when the addresses and certificates change.

The problem has more to do with ensuring good quality, up-to-date records. There is also a problem of opening further vulnerability to various forms of DDoS and other attacks (also: I'm not sure about how redirection could be safely done in this case).

(I'm assuming a unique PK per individual site here, which is not necessarily a valid assumption. Edit: after further real-world surveying, it seems that it is actually very common for a certificate to be shared between 20+ domains and subdomains, so sadly the automated approach wouldn't work for the most part. The publisher would need to use a unique key pair for each individual publication.)

rotemdan commented 7 years ago

OK, it seems like the DHT node handshake verification could lead to a 'botnet' like DDoS attack vulnerability. I'll explain it and try to give ideas of how it can be avoided:

  1. An attacker creates many random key pairs.
  2. The attacker chooses to attack a particular host IP. Say 172.217.21.238:443
  3. The attacker submits thousands of bogus IPNS entries to the network that point to that address.
  4. The receiving nodes try to validate that address and create thousands of simultaneous connections to that address.
  5. Despite the fact that the verifications fail, the host is overloaded by many connections from different origins and becomes temporarily unreachable.

Solution 1: the DHT could just unconditionally store the entries as long as the internal signature is correct. If a bogus entry is submitted (though the signature is valid), it would just accept it. In a sense this is not any different than having bogus IPNS entries that point to IPFS locators with random, meaningless hashes.

Redirection could still be securely done, I believe. If the submitter proves ownership of both private keys it doesn't really matter if the newly pointed host is bogus or not.

The question that needs to be asked now is whether the usage of bogus IPNS-to-HTTPS locators could still be used for DDoS attacks somehow. I still need to think about that.

Solution 2: The nodes would defer the verification to a random time within a time span of say, a day. This would allow bogus entries to be eventually identified with a lesser risk of overloading the host.
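
Solution 2 could be sketched as below: each node picks a uniformly random delay within the time span before contacting the host. The function name and the uniform distribution are assumptions made for illustration.

```python
import random
from typing import Optional

DAY_SECONDS = 24 * 60 * 60


def schedule_verification(
    received_at: float,
    span: float = DAY_SECONDS,
    rng: Optional[random.Random] = None,
) -> float:
    """Return the time at which this node should verify the entry."""
    if rng is None:
        rng = random.Random()
    # A uniformly random delay spreads the connections from many nodes over the span.
    return received_at + rng.uniform(0, span)
```

With thousands of bogus entries, the target host would then see the verification connections spread over the day rather than arriving all at once.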

rotemdan commented 7 years ago

Trying to be less technical for a bit. I'll try to summarize why I think IPNS to HTTPS resolution is important:

The system I'm trying to figure out here could be seen as an independent separate project. I decided to suggest it here because I feel there's really no need for fragmentation and duplication. You've already built and are committed to further improving an infrastructure for efficient distributed lookups that does something very close and may work as well for this purpose.

rotemdan commented 7 years ago

I wanted to clarify a couple of things here, especially about certificates:

Unlike the way TLS/HTTPS connections are commonly judged to be 'secure', the hash-based naming system I'm suggesting here doesn't particularly care about certificates at all, other than the public key they contain (it could technically even accept expired or unsigned certificates). It does not try or claim to ensure the identity of the host. It only uses the PK in the certificate to 'piggyback' over an existing protocol while only using a subset of it.

So how can this be truly 'secure'? why would anyone use this?

Let's say someone sends you a link that looks like this:

/ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/ReputableSeller/BuyExpensiveProduct

And tells you this is a real Amazon link and you can safely buy the product. You open the link, and it looks like a real Amazon product page.

The browser tells you the connection is secure, but doesn't show you any clear domain name, certificate details, trusted signatures, or ownership information (or even explicitly informs you that the publisher is 'anonymous' and may not be trustworthy).

Why would you trust this? Sure, the connection is 'secure' in the sense that it is protected from MITM and impostor attacks, you can be assured that it connects to the host it was truly intended to reach, and (hopefully) no one can eavesdrop on it, but it doesn't really guarantee anything else.

So again you might ask: this sounds really bad, how is this useful?

It is bad, and probably should never be used for applications like e-commerce or banking, which require high levels of identity information and trust in a foreign entity, unless backed by an actual, high-quality certificate (or some other reliable form of identity verification). This doesn't, however, mean that it is completely useless. Far from it.

Another way to look at it: why would anyone use IPNS [to IPFS resolution], given that it has the same PK hash-based security model? Why would you ever touch it?

Because it relies on a different security model that would truly 'shine' in a different set of scenarios, especially ones where signed identity information is not particularly needed and the source that created or published the locator is trusted or personally known.

Here are some example use cases:

  1. I've got a fast business connection and I have 200 industrial devices and 1 IPv4 address. I want to run a publicly accessible secure HTTP server on each one of them (a different public port for each one). As a security requirement, I must use a unique key pair for each device. How do I do it?
  2. I have a smart lamp, refrigerator, toaster and television in my home. They are all WiFi connected devices. I want to run a secure, publicly accessible TLS listener on each one of them without involving a third party service, registering any domain or acquiring certificates. How do I do it?
  3. I'm running a web service on my personal computer that connects to my personal financial data and reads from a local database. I want to quickly set up a secure address to access it from anywhere without relying on centralized infrastructure. How do I do it?
  4. I want to set up a secure address (or even better - the software would be able to set the address automatically) to globally access my own/my work's/my friends' media server, NAS, private cloud, router, remote desktop, VPN etc.
  5. I want to set up a direct way to securely connect to my smartphone, tablet, or connected car. However, they are constantly roaming between different mobile and WiFi (some of them public) networks, and sometimes aren't even reachable at all, so using a permanent IP is not viable (and wouldn't be safe on its own anyway). I'm not interested in using or paying for a third party dynamic DNS service like DynDNS. ...

Amazingly, there are no simple solutions for these scenarios today! Using the IP alone (if even possible) is not secure and leaves the connection vulnerable to MITM and impostor attacks! (Also, you need a permanent IP, which is not necessarily cheap or available to the average consumer.) Sure, the software can use an ad-hoc trick to 'store' the PK at the first request and validate it on the next ones (though this feature is not usually available in common software, with the exception of some VPN utilities). However, even with that trick it cannot ensure that the first request it made was indeed to the host it was supposed to reach!
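
The 'store on first request' trick and its weakness can be sketched as follows. The `PinStore` class and the address used are hypothetical. The point is that whatever key answers first gets pinned, with nothing to check it against.

```python
import hashlib
from typing import Dict


class PinStore:
    """Remember the key hash seen on first contact with each address."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def check(self, address: str, presented_key: bytes) -> bool:
        digest = hashlib.sha256(presented_key).hexdigest()
        pinned = self._pins.get(address)
        if pinned is None:
            # First contact: the key is accepted without any verification.
            self._pins[address] = digest
            return True
        return pinned == digest


store = PinStore()
# If an impostor answers the very first request, its key is the one that gets pinned.
first = store.check("203.0.113.5:8443", b"impostor-key")
later = store.check("203.0.113.5:8443", b"genuine-key")
```

Here `first` is accepted and `later` is rejected, which is the reverse of what the user wants. With a hash-based name, the expected key hash is known before the first connection, so this gap does not arise.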

Edits: some major rephrasing and expansions

rotemdan commented 7 years ago

[Note: the previous comment was heavily edited and improved since it was first published, so please re-read it before reading this one!]

Making the 'best' of both worlds: can certificates co-exist with hashes?

There is nothing preventing the hash-based location system from supporting 'traditional' certificates as well (after all, they are a native part of the protocol). The main obstacle is what the user would see in the location bar:

/ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/ReputableSeller/BuyExpensiveProduct

or perhaps when more adapted to the browser (when natively supported), something like:

https://QmPXME1oRtoT627YKaDPDQ3Pw.ipns/ReputableSeller/BuyExpensiveProduct

What if this really is an Amazon-published page? Let's say the site provided a valid, signed certificate for the real 'Amazon, Inc.'. Could the location bar be somehow 'rewritten' to:

https://(Amazon, Inc.)/ReputableSeller/BuyExpensiveProduct

(I mean that when viewed it would overlay this simplified location, but when edited it would switch to the original, hash-based one)

or perhaps just display the publisher details clearly but separately, like is already done on modern browsers for secure connections?

(Amazon, Inc.) https://QmPXME1oRtoT627YKaDPDQ3Pw.ipns/ReputableSeller/BuyExpensiveProduct

Will this 'feel' secure enough? would it really be a good idea? Maybe? who knows?

I hope I'm starting to get to you people.. whoever is reading this.. perhaps this all appears completely obvious and I'm just 'preaching to the choir' here? (then please tell me!).

Anyway, I'm getting really excited about this. I want this to happen! now! seriously.. it would be just 'embarrassingly' useful! would love to hear your comments! If you think this is just a 'terrible' idea then please let me know! I'll try to convince you otherwise..

edits: more ideas
edit: changed domain to publisher name in location examples
edit: changed hypothetical IPNS pseudo-domain format to something that parses better

jeasonstudio commented 6 years ago

Awesome.