Open rotemdan opened 8 years ago
Some preliminary notes:
(1) In terms of the protocol, having an IPNS locator like /ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/some/path/randomcat.jpg
to resolve to an HTTPS URL like https://172.217.21.238:443/some/path/randomcat.jpg
would not seem to violate most high-level client assumptions about the locator:
(2) Based on my superficial analysis so far, it seems that using IPNS to point to a concrete IP host this way would be at least as secure as using IPNS to point to an immutable IPFS resource. The only difference is that the 'content-defined' IPFS locator is replaced by a 'certificate-defined' (or 'public key-defined') concrete host locator.
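To make the 'public key-defined' idea in (2) concrete, here is a minimal Python sketch of the check a client could run after the TLS handshake: hash the key the host presented and compare it with the name in the locator. The naming rule used here (base58 of a SHA-256 multihash over the raw key bytes) is a simplifying assumption for illustration, and `key_name` and `host_matches_locator` are hypothetical helpers, not part of any IPFS API. Real IPNS names are derived from a serialized key format that this sketch does not reproduce.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58, keeping leading zero bytes as '1' characters."""
    num = int.from_bytes(data, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def key_name(public_key_bytes: bytes) -> str:
    """Assumed naming rule: base58 of a SHA-256 multihash of the key bytes."""
    digest = hashlib.sha256(public_key_bytes).digest()
    # 0x12 is the multihash code for sha2-256, 0x20 is the digest length (32).
    multihash = bytes([0x12, 0x20]) + digest
    return base58_encode(multihash)


def host_matches_locator(locator: str, presented_key: bytes) -> bool:
    """True if the key presented by the host hashes to the name in /ipns/<name>/..."""
    parts = locator.split("/")
    if len(parts) < 3 or parts[0] != "" or parts[1] != "ipns":
        return False
    return parts[2] == key_name(presented_key)
```

A client would accept the connection only when `host_matches_locator` returns `True` for the key taken from the certificate; nothing else in the certificate is consulted.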
(3) I've read several discussions about how to serve dynamically generated content over IPFS/IPNS. The suggested methods seemed a bit on the complex side. This proposal could provide a solution that greatly simplifies it while avoiding falling back to the more 'traditional' DNS system.
cc @jbenet @whyrusleeping - would love to hear your comments on this.
Trying to tackle some technical and other misc issues:
What about when the certificate/PK is changed by the author (not necessarily due to a compromised key but, say, for periodic renewal or for extra security reasons)? Will the locator be broken forever? Solution: a new IPNS entry would be created and the old one could be modified to point to it instead of to the concrete host locator. Basically, instead of pointing to an HTTPS host, the old PK hash would now point to the new PK hash by resolving to a secondary IPNS locator. If the new hash is verified and the author has proved ownership of both key pairs by providing the correct signatures, then the entry is considered valid. This redirection can happen several times, though only one hop is actually needed: if the owner kept all previous key pairs, they can update all of them to point to the most recent one.
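The redirection described above can be sketched as a small resolver loop. This is only an illustration under stated assumptions: `records` is a hypothetical in-memory stand-in for the DHT, every record in it is assumed to have already passed signature checks, and `MAX_REDIRECTS` is an arbitrary cap chosen for the sketch.

```python
from typing import Dict, Optional

MAX_REDIRECTS = 8  # arbitrary limit for this sketch


def resolve(name: str, records: Dict[str, str]) -> Optional[str]:
    """Follow /ipns/ -> /ipns/ redirects until a non-IPNS target is found.

    Returns None for unknown names, redirect loops, or overly long chains.
    """
    seen = set()
    current = name
    for _ in range(MAX_REDIRECTS + 1):
        if current in seen:
            return None  # redirect loop
        seen.add(current)
        target = records.get(current)
        if target is None:
            return None  # no record for this name
        if not target.startswith("/ipns/"):
            return target  # concrete host locator reached
        current = target[len("/ipns/"):]
    return None  # chain longer than MAX_REDIRECTS
```

For example, with `{"oldKeyHash": "/ipns/newKeyHash", "newKeyHash": "https://172.217.21.238:443"}` a lookup of `oldKeyHash` ends at the HTTPS host after one hop, which is the single redirect the owner needs if every old entry points straight at the newest key.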
What about multiple IPs per PK? failover? round robin? random host selection? location based resolution? This needs to be investigated further.
Why limit just to the proven owner? Why not allow everyone to submit these types of IPNS entries as long as they are verifiable through the handshake? There is actually no fundamental reason why not. Technically a bot could go and scan entire DNS registries, contact the hosts, read the certificate's PK and automatically submit it to IPNS to eventually index significant parts of the public web, and also update the entries when the addresses and certificates change.
The problem has more to do with ensuring good quality, up-to-date records. There is also a problem of opening further vulnerability to various forms of DDoS and other attacks (also: I'm not sure about how redirection could be safely done in this case).
(I'm assuming a unique PK per individual site here, which is not necessarily a valid assumption though. Edit: after further real-world surveying it seems like it is actually very common that a certificate is shared between 20+ domains and subdomains.. so I guess sadly the automated approach wouldn't work, for the most part.. the publisher would need to be required to use a unique key pair for each individual publication)
OK, it seems like the DHT node handshake verification could lead to a 'botnet' like DDoS attack vulnerability. I'll explain it and try to give ideas of how it can be avoided:
Solution 1: the DHT could just unconditionally store the entries as long as the internal signature is correct. If a bogus entry is submitted (though the signature is valid), it would just accept it. In a sense this is not any different than having bogus IPNS entries that point to IPFS locators with random, meaningless hashes.
Redirection could still be securely done, I believe. If the submitter proves ownership of both private keys it doesn't really matter if the newly pointed host is bogus or not.
The question that needs to be asked now is whether bogus IPNS-to-HTTPS locators could still be used for DDoS attacks somehow. I still need to think about that.
Solution 2: The nodes would defer the verification to a random time within a time span of say, a day. This would allow bogus entries to be eventually identified with a lesser risk of overloading the host.
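A rough Python sketch of the deferral idea in Solution 2, assuming each node independently draws a uniform random delay within the window. The one-day window comes from the text above; the function names and the load estimate are hypothetical and only meant to show why spreading the checks lowers the peak load on the host.

```python
import random

VERIFY_WINDOW_SECONDS = 24 * 60 * 60  # "say, a day"


def verification_delay(rng: random.Random,
                       window: float = VERIFY_WINDOW_SECONDS) -> float:
    """Pick how long a node waits before contacting the host to verify an entry."""
    return rng.uniform(0, window)


def expected_requests_per_second(node_count: int,
                                 window: float = VERIFY_WINDOW_SECONDS) -> float:
    """Average verification load on the host if every node checks once per window."""
    return node_count / window
```

Under these assumptions, 86,400 nodes verifying the same entry would average about one handshake per second at the host instead of arriving all at once.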
Trying to be less technical for a bit. I'll try to summarize why I think IPNS to HTTPS resolution is important:
The system I'm trying to figure out here could be seen as an independent separate project. I decided to suggest it here because I feel there's really no need for fragmentation and duplication. You've already built and are committed to further improving an infrastructure for efficient distributed lookups that does something very close and may work as well for this purpose.
I wanted to clarify a couple of things here, especially about certificates:
Unlike the way TLS/HTTPS connections are commonly judged 'secure', the hash-based naming system I'm suggesting here doesn't particularly care about certificates at all, other than the public key they contain (it could technically even accept expired or unsigned certificates). It does not try or claim to ensure the identity of the host. It only uses the PK in the certificate to 'piggyback' on an existing protocol while using only a subset of it.
So how can this be truly 'secure'? why would anyone use this?
Let's say someone sends you a link that looks like this:
/ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/ReputableSeller/BuyExpensiveProduct
And tells you this is a real Amazon link and you can safely buy the product. You open the link, and it looks like a real Amazon product page.
The browser tells you the connection is secure, but doesn't show you any clear domain name, certificate details, trusted signatures, or ownership information (or even explicitly informs you that the publisher is 'anonymous' and may not be trustworthy).
Why would you trust this? sure, the connection is 'secure' in the sense that it is protected from MITM and impostor attacks, and you can be assured that it connects to the host it was truly intended to, and (hopefully) no one can eavesdrop on it, but it doesn't really guarantee anything else.
So again you might ask: this sounds really bad, how is this useful?
It is bad, and probably should never be used for applications like e-commerce or banking that require high levels of identity information and trust in a foreign entity, unless backed by an actual, high quality certificate (or some other reliable form of identity verification). This doesn't, however, mean that it is completely useless, far from it.
Another way to look at it: why would anyone use IPNS [to IPFS resolution]? since it has the same PK hash-based security model. Why would you ever touch it?
Because it relies on a different security model that would truly 'shine' on a different set of scenarios, especially ones where signed identity information is not particularly needed and the source that has created or published the locator is trusted or personally known.
Here are some example use cases:
Amazingly there are no simple solutions for these scenarios today! using the IP alone (if even possible) is not secure and leaves the connection vulnerable to MITM and impostor attacks! (also - you need a permanent IP, which is not necessarily cheap or available to the average consumer). Sure the software can use an ad-hoc trick to 'store' the PK at the first request and validate it on the next ones (though this feature is not usually available in common software with the exception of some VPN utilities). However, even with that trick it cannot ensure that the first request it made was indeed to the host it was supposed to!
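The 'store the PK at the first request' trick and its weak spot can be illustrated with a short Python sketch. `FirstUsePinStore` is a hypothetical helper written for this example, not something taken from existing software.

```python
import hashlib
from typing import Dict


class FirstUsePinStore:
    """Remembers the first key fingerprint seen for each host."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def check(self, host: str, public_key_bytes: bytes) -> str:
        fingerprint = hashlib.sha256(public_key_bytes).hexdigest()
        pinned = self._pins.get(host)
        if pinned is None:
            # Nothing to compare against yet: whatever key shows up first
            # is accepted and remembered, even if it belongs to an impostor.
            self._pins[host] = fingerprint
            return "pinned-on-first-use"
        return "match" if pinned == fingerprint else "mismatch"
```

If an impostor answers the very first request, its key is the one that gets pinned, and the genuine host is later reported as a mismatch. A locator that already carries the PK hash does not have this gap, because the expected key is known before the first connection is made.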
Edits: some major rephrasing and expansions
[Note: the previous comment was heavily edited and improved since it was first published, so please re-read it before reading this one!]
There is nothing preventing the hash-based location system from supporting 'traditional' certificates as well (after all, they are a native part of the protocol). A main obstacle is what the user would see in the location bar:
/ipns/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/ReputableSeller/BuyExpensiveProduct
or perhaps when more adapted to the browser (when natively supported), something like:
https://QmPXME1oRtoT627YKaDPDQ3Pw.ipns/ReputableSeller/BuyExpensiveProduct
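As a sketch of how a browser might map one form to the other, here is a small Python helper. `to_pseudo_domain_url` is a hypothetical name, the `.ipns` suffix is just the format from the example above, and the helper keeps the full name rather than the shortened one shown in that example.

```python
def to_pseudo_domain_url(locator: str) -> str:
    """Rewrite /ipns/<name>/<path> as https://<name>.ipns/<path>."""
    parts = locator.split("/", 3)
    if len(parts) < 3 or parts[0] != "" or parts[1] != "ipns" or not parts[2]:
        raise ValueError("expected a locator of the form /ipns/<name>/<path>")
    name = parts[2]
    path = parts[3] if len(parts) > 3 else ""
    return f"https://{name}.ipns/{path}"
```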
What if this really is an Amazon-published page? Let's say the site provided a valid, signed certificate for the real 'Amazon, Inc.'. Could the location bar be somehow 'rewritten' to:
https://(Amazon, Inc.)/ReputableSeller/BuyExpensiveProduct
(I mean that when viewed it would overlay this simplified location, but when edited it would switch to the original, hash-based one)
or perhaps just display the publisher details clearly but separately, like is already done on modern browsers for secure connections?
(Amazon, Inc.) https://QmPXME1oRtoT627YKaDPDQ3Pw.ipns/ReputableSeller/BuyExpensiveProduct
Will this 'feel' secure enough? would it really be a good idea? Maybe? who knows?
I hope I'm starting to get to you people.. whoever is reading this.. perhaps this all appears completely obvious and I'm just 'preaching to the choir' here? (then please tell me!).
Anyway, I'm getting really excited about this. I want this to happen! now! seriously.. it would be just 'embarrassingly' useful! would love to hear your comments! If you think this is just a 'terrible' idea then please let me know! I'll try to convince you otherwise..
edits: more ideas edit: changed domain to publisher name in location examples edit: changed hypothetical IPNS pseudo-domain format to something that parses better
Awesome.
It seems like IPNS could, in theory, provide a rudimentary alternative for the DNS and CA infrastructure, and act as a complement to the plain TLS handshake and HTTP protocol stack (though with some minor modifications):
My question is: has this ever been considered as a possible feature of IPFS/IPNS, in some form? are there any special security implications that need to be noted?
I'm aware of the issue of multiple valid hosts sharing the same public key (though it seems possible an elegant solution for that could be found). Perhaps more generally I'll ask what are your current views on this direction - of providing a gateway to the more conventional 'mutable' web through the distributed name system (technical issues aside)?
Edit:
Some points of clarification:
Edit2: some corrections