ipfs / in-web-browsers

Tracking the endeavor towards getting web browsers to natively support IPFS and content-addressing
https://docs.ipfs.tech/how-to/address-ipfs-on-web/
MIT License

TLS for DNSLink websites loaded via public subdomain gateways #169

Closed lidel closed 3 years ago

lidel commented 4 years ago

By default, people will point a DNSLink website at their own go-ipfs node and set up a proper TLS cert for the domain. This is how https://en.wikipedia-on-ipfs.org works.

Problems when loading DNSLink website from a public gateway

There are known problems when someone tries to load a DNSLink website from an alternative, public gateway:

Constraints

I think the main criteria for DNSLink representation are:

  1. DNS label representing FQDN with DNSLink should remain human-readable
  2. When loaded in web browser DNSLink website should get proper Origin isolation between content roots
  3. Subdomain URLs should work with any HTTP client

Solving the TLS problem

:point_right: Below is a summary of options I see, would love to hear what others think / do sanity check on this / propose your own.

(A) Mining wildcard certificates on the fly

My understanding is that to support DNSLink in subdomains at our public gateway without this TLS error we would have to create some magical orchestration which "mines certificate" on first load.

While it should be technically possible via some nginx+lua hackery (needs sanity check), it may violate the ToS of services like Let's Encrypt. This means we would have to work with the CA, make our case and ensure they won't ban us when we do it. If we get the green light, we could either document the setup for other gateways to use, or consider implementing the magic in go-ipfs' gateway itself.

:broken_heart: I don't like this because it brings complexity/centralization of PKI deep into go-ipfs/gateway logic, and we should avoid that if we can.

(B) Do nothing, live with TLS errors

We could just live with the TLS error, and when someone points it out say "to be independent of centralized PKI you need to Install IPFS Desktop and IPFS Companion".

:broken_heart: Avoiding the problem is bad, but I also worry people won't understand the nuance of why, and will just take note that "IPFS breaks/ignores web security", which is a pretty bad meme.

(C) Encode FQDN to a string that fits in a single DNS label

This may be the least painful fix so far: if we come up with a way of encoding domain names to something that fits in a single DNS label (max 63 characters), then https://dweb.link/ipns/foo.tld would redirect to https://{encoded-foo-tld}.ipns.dweb.link and there would be no TLS error (at least for domains whose encoded form fits in 63 characters).

:green_heart: The encoding step could be added to go-ipfs (just like we convert to base36 if base32 is too long) and that way every gateway would support this with pre-existing TLS certs (no additional setup, seamless upgrade).

Challenges:

bertrandfalguiere commented 4 years ago

What about (C) with an escape character? For example, with - as an escape character, foo-bar.tld becomes foo--bar-tld and foo.bar.tld becomes foo-bar-tld.
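The escaping scheme suggested here can be sketched in a few lines. This is only an illustration of the idea, not code from go-ipfs; the function names `encode_label` and `decode_label` are made up for this example.

```python
def encode_label(fqdn: str) -> str:
    """Inline a domain name into a single DNS label.

    Existing '-' is escaped as '--' first, then every '.' becomes '-'.
    """
    return fqdn.replace("-", "--").replace(".", "-")


def decode_label(label: str) -> str:
    """Reverse encode_label: '--' maps back to '-', a lone '-' back to '.'.

    Assumes the original labels did not start or end with '-',
    which valid hostnames never do.
    """
    placeholder = "\0"  # cannot appear in a DNS name
    return (
        label.replace("--", placeholder)
        .replace("-", ".")
        .replace(placeholder, "-")
    )


print(encode_label("foo-bar.tld"))   # foo--bar-tld
print(encode_label("foo.bar.tld"))   # foo-bar-tld
print(decode_label("foo--bar-tld"))  # foo-bar.tld
```

Escaping the hyphen first is what keeps the mapping reversible: without it, foo-bar.tld and foo.bar.tld would both collapse to foo-bar-tld.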

Gozala commented 4 years ago

I have faced this exact dilemma in my lunet experiment and have settled on the following compromise:

Loading https://lunet.link/en.wikipedia-on-ipfs.org does not redirect to anything; instead it:

  1. Looks up the DNSLink record for the path en.wikipedia-on-ipfs.org
    • If it maps to /ipns/Qm..hash/, uses the origin https://Qm..hash.celestal.link
    • If it maps to /ipfs/Qm...hash/, derives the origin by hashing /ipns/en.wikipedia-on-ipfs.org, e.g. drv...hash.celestal.link.
  2. Serves a page with content like <iframe style="width: 100%; height: 100%" src="https://drv..hash.lunet.link" />
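The origin selection in step 1 could look roughly like the sketch below. The thread does not say which hash lunet used or how the label was formatted, so SHA-256, the 32-character truncation, and the `drv` prefix are assumptions made purely for illustration, as is the function name `derive_origin`.

```python
import hashlib


def derive_origin(domain: str, dnslink_value: str,
                  origin_suffix: str = "celestal.link") -> str:
    """Pick an isolated origin for a DNSLink site.

    dnslink_value is the resolved DNSLink path, e.g. '/ipns/<key>/'
    or '/ipfs/<cid>/'.
    """
    kind, _, rest = dnslink_value.strip("/").partition("/")
    if kind == "ipns":
        # The IPNS key itself becomes the origin label.
        label = rest.split("/")[0]
    else:
        # Fallback: hash the domain path, NOT the CID, so the origin
        # stays stable when the site content (and its CID) changes.
        digest = hashlib.sha256(f"/ipns/{domain}".encode()).hexdigest()
        label = "drv" + digest[:32]  # assumed format, fits in 63 chars
    return f"https://{label}.{origin_suffix}"
```

Hashing the domain rather than the content root means a site keeps its origin (and its storage) across updates.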

This achieved the following:

  1. User would still see nice URLs like https://lunet.link/en.wikipedia-on-ipfs.org.
  2. Every page was origin separated where origin was IPNS public key, with fallback to a domain hash.

    Note: I think public keys are a much better option than domain names because:

    • They allow multiple different domains to share the origin
    • A site can migrate across domain names while preserving its origin (and storage)
    • They are decentralized, as in anyone can generate a key pair
    • Record updates could be signed with a private key to ensure integrity
  3. One could share a link that looked and worked as expected.

It had the following downsides:

  1. Reflecting URL changes e.g. via pushState required some JS trickery.
  2. curl-ing the URL would not give you the actual content, but rather an iframe pointing at some celestal.link origin.

    That could probably be addressed via redirect based on user agent and other headers.

That is all to suggest that I think it's best to decouple the UX and Origin separation problems from each other. E.g. dweb.link could be used to do the Origin separation and another domain could provide human-meaningful URLs on top. I don't believe both could be achieved at the same time.

If I were to choose between some domain escaping logic and the use of public keys in its place, I would much rather go with the latter, as in my experience deriving origins from domain names did not really provide good UX; on the contrary, it resembled spoofing attacks.

lidel commented 4 years ago

I feel that lunet operated at higher abstraction layers, to which we don't have access here (no keys etc).

I think the main criteria for DNSLink representation are (updated first comment):

  1. DNS label representing FQDN with DNSLink should remain human-readable
    • Frankly, DNSLink already looks like a "spoofing attack": http://docs.libp2p.io.ipns.localhost:8080. I don't think http://docs-libp2p-io.ipns.dweb.link would be much worse. I'd say it looks LESS suspicious.
  2. When loaded in web browser DNSLink website should get proper Origin isolation between content roots
  3. Subdomain URLs should work with any HTTP client
    • Solution should work at HTTP level, not HTML or JS
    • Either no iframes or we are forced to do even more user-agent sniffing (which we hoped to move away from)
      • while https://dweb.link/ipns/dnslink.site.example.com could return HTML with:
        <iframe src="https://dnslink-site-example-com.ipns.dweb.link" />

Note: DNSLink representation in a subdomain on a public gateway will be a niche use case. Most people will load a DNSLink website either from a localhost gateway or from the original domain.

I am leaning towards keeping this very simple and implementing a variant of (C) which supports encoded domains:

Gozala commented 4 years ago

> I feel that lunet operated at higher abstraction layers, to which we don't have access here (no keys etc).

That is a fair assessment. I don't think this is in conflict with what I was trying to suggest, however. The way I see it, https://${derive_origin(domain)}.dweb.link is an equivalent of what lunet used to load inside iframes (that is, https://${derive_origin(domain)}.celestal.link).

That higher level abstraction layer could be built on top of this.

I do still, however, think that it would be highly beneficial if the derive_origin function could be extended beyond domain.replace(/\./g, '-'). Specifically, what I'm suggesting it could do is the following:

  1. Read a TXT DNS record, similar to (or an extended version of) dns_link, that provides a public key to be used in place of domain.replace(/\./g, '-'), and return that.
  2. If public key is not present fall back to the substitution based derivation.
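A rough sketch of that two-step derivation follows. No such record format was specified in this thread, so the `pubkey=` TXT prefix is purely hypothetical, as are the function name and the injected `lookup_txt` callable (used here instead of a real resolver to keep the example self-contained). The fallback uses the hyphen-escaping substitution discussed earlier in the thread.

```python
from typing import Callable, List


def derive_origin_label(domain: str,
                        lookup_txt: Callable[[str], List[str]]) -> str:
    """Return the DNS label to use as the origin for `domain`.

    lookup_txt(name) should return the TXT record strings for `name`.
    """
    # 1. Prefer a public key published next to the DNSLink record.
    #    The 'pubkey=' prefix is a HYPOTHETICAL format for illustration.
    for record in lookup_txt(f"_dnslink.{domain}"):
        if record.startswith("pubkey="):
            return record[len("pubkey="):]
    # 2. Otherwise fall back to substitution-based derivation.
    return domain.replace("-", "--").replace(".", "-")


# Usage with a fake resolver backed by a dict:
records = {"_dnslink.foo.com": ["dnslink=/ipfs/QmExample", "pubkey=k51example"]}
resolver = lambda name: records.get(name, [])
print(derive_origin_label("foo.com", resolver))  # k51example
print(derive_origin_label("bar.org", resolver))  # bar-org
```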

That enables:

  1. A site to optionally migrate from domain to domain without losing its user data cache.
  2. Loading https://dweb.link/ipns/${pub_key_for_foo_com} and https://dweb.link/ipns/foo.com would result in the same origin.

It does make redirect URLs less readable, as in https://foo-com.ipns.dweb.link vs https://Qm...hash.dweb.link. However, since providing a public key is optional, I think it's reasonable to give the domain owner a choice here.

All that said, doing simple char substitution in this iteration and considering future extensions (like the mentioned public-key-based origins) may be the best compromise. I did, however, want to point out that in my prior experience I found simple char substitution problematic because it did not play well with IPNS.

lidel commented 4 years ago

Ok, I am leaning towards shipping support for simple substitution in go-ipfs ~0.9: https://dweb.link/ipns/my.v-long.example.com → https://my-v--long-example-com.ipns.dweb.link

The immediate need is to make subdomain gateways a drop-in replacement for all paths without surprises or need for custom TLS setup (including ones with a DNSLink name).
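As a sanity check of the mapping above, here is a small sketch that rewrites a path-gateway URL into its subdomain form. It is not the go-ipfs implementation; `to_subdomain_url` is a name invented for this example, and returning None for names that do not fit in a label is just one possible way to handle that case.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

MAX_DNS_LABEL = 63  # maximum length of a single DNS label


def to_subdomain_url(url: str) -> Optional[str]:
    """Map https://gw/ipns/<fqdn>/<path> to https://<label>.ipns.gw/<path>.

    Returns None for non-/ipns/ paths or when the encoded name is too long.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/", 3)  # ['', 'ipns', name, rest?]
    if len(segments) < 3 or segments[1] != "ipns" or not segments[2]:
        return None
    # Escape existing '-' as '--', then turn '.' into '-'.
    label = segments[2].replace("-", "--").replace(".", "-")
    if len(label) > MAX_DNS_LABEL:
        return None
    rest = "/" + segments[3] if len(segments) > 3 else "/"
    host = f"{label}.ipns.{parts.netloc}"
    return urlunsplit((parts.scheme, host, rest, parts.query, parts.fragment))


print(to_subdomain_url("https://dweb.link/ipns/my.v-long.example.com"))
# https://my-v--long-example-com.ipns.dweb.link/
```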

Extending DNSLink with optional pubkey is exciting, but requires more analysis and I feel we need to park it until we have time to work on IPNS itself.

eminence commented 4 years ago

I'm happy to see progress on this!

One quick note:

Cloudflare IPFS gateways actually have good support for this. They have a very simple web form (scroll down to the bottom of the page) where you can enter your domain name, and they'll quickly generate the necessary certificate, allowing you to CNAME your domain to www.cloudflare-ipfs.com and get DNSLink + TLS + Custom domain.

This is basically a variant of solution (A) above, but doesn't involve any "on-the-fly" trickery. I think Cloudflare has a big advantage here, though, since they have their own CA to use.

The only thing preventing me from using this successfully is that I get very poor performance with the Cloudflare gateway (which honestly is a little surprising given what Cloudflare's main business is).

lidel commented 4 years ago

@eminence it is indeed nice, but it relies on a single CA, which is pretty bad decentralization-wise. I believe (C) will solve it for everyone, and if you wish to have a "nicer" name then you can still use the Cloudflare hack.

benhylau commented 3 years ago

Also hope to see this shipped soon. To me I think @lidel's approach C sounds reasonable:

https://dweb.link/ipns/my.v-long.example.com → https://my-v--long-example-com.ipns.dweb.link

That said, I would prefer hashing to a CID over label manipulation, even though this obviously loses the human readability:

https://dweb.link/ipns/my.v-long.example.com → https://{cid}.ipns.dweb.link

lidel commented 3 years ago

A PR draft with (C), triggered via X-Forwarded-Proto: https, is at https://github.com/ipfs/go-ipfs/pull/7847. I will finish it and try to fast-track it next week, as we need it for our external collaborations.
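One way to read "triggered via X-Forwarded-Proto: https" is that the gateway only inlines the DNSLink name into a single label when a TLS-terminating reverse proxy reports that the original request was HTTPS, since that is the only case where the wildcard certificate matters. The sketch below illustrates that reading only; the function name and the exact condition are assumptions, not taken from the PR.

```python
from typing import Mapping


def should_inline_dnslink(headers: Mapping[str, str], name: str) -> bool:
    """Decide whether a DNSLink name should be inlined into one DNS label.

    Assumed rule: inline only for HTTPS requests (as reported by the
    proxy) and only for names containing dots, since dotted names are
    the ones a single-level wildcard certificate cannot cover.
    """
    proto = ""
    for key, value in headers.items():
        if key.lower() == "x-forwarded-proto":  # header names are case-insensitive
            proto = value.strip().lower()
    return proto == "https" and "." in name
```

Plain HTTP requests (for example to a localhost gateway) would keep the readable dotted form under this rule.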

lidel commented 3 years ago

We're running https://github.com/ipfs/go-ipfs/pull/7847 on dweb.link, and the feature is in go-ipfs master and scheduled to be included in go-ipfs 0.8.0-rc2 next week. Please test and comment if you find any issues :pray:

Demos: