DEP-0005: Move DNS TXT record to a dedicated subdomain

da2x commented 6 years ago

The current DEP-0005 proposal is unusable with CNAME records. Imagine the following zone file:

www.example.com. IN CNAME 3600 cdn-service-or-something.example.net

It’s impossible to add additional TXT records to the www subdomain in the above setup. You can’t add any other records to a name that has been “redirected”/outsorced with CNAMEs. IPFS has solved this by using a dedicated subdomain for DNS-based discovery of IPFS hashes. E.g.

_dnslink.www.example.com. IN TXT 3600 "dnslink=/ipfs/{hash}"

I propose a small change to DEP-0005 to address this and align the mechanism with RFC 6763: DNS-Based Service Discovery. The change would be to deprecate the current draft proposal of adding TXT records to the named zone, and instead add it to a DNS-SD subdomain (which is compatible with CNAME deployments):

_dweb._udp.www.example.com. IN TXT 3600 "datkey={key}"

—and I do mean that the current method should be deprecated to avoid a future were multiple DNS lookups are required to discover DAT keys. IPFS support both which leads to unnecessary DNS lookups. References to it should be removed from all documentation and support dropped in Beaker Browser after a year or so.

Why _dweb._udp.? it’s conceivable that other protocols would want to use DNS to auto-discover distributed web tools. _dweb is short and generic enough to allow for other uses, and thus increasing the likelihood that the record will be cached somewhere nearer to the end-user in the DNS hierarchy. E.g. IPFS could use the same subdomain with an ipfskey record. Using that argument, Dat should consider using _dnslink. like IPFS does. I’m against that because it’s not compliant with the well-established and well supported RFC 6763. Also, the name dnslink is redundant and meaningless.

The second subdomain ,_udp, is a common name for DNS-SD defined in RFC 6763 that allows for all service discovery methods to be delegated to another delegated to a secondary DNS service. The name _udp should have been _srv (“service”) but it’s _udp is for legacy reasons. See RFC 6763 section 7. for the details.

Technically, SRV (service discovery) records should be used instead of TXT records.

_dweb._udp.www.example.com. IN SRV 3600 "datkey={key}"

However, I don’t have any data on how widely they’re supported in managed DNS solution. I believe they should be supported everywhere except the most outdated and insecure cPanel instances and legacy systems. Using SRV instead of TXT has a potential performance benefit (see RFC 6963 section 12.2) for any discovery mechanism that requires further DNS requests to use a discovered service (such as IPFS+IPNS+DNSLink). It would potentially be beneficial to other dweb solution to stick with SRV type records even though Dat doesn’t use it itself at this time.

pfrazee commented 6 years ago

We also support the /.well-known/dat solution over HTTPS, does that make a feasible alternative?

da2x commented 6 years ago

We also support the /.well-known/dat solution over HTTPS, does that make a feasible alternative?

Performance-wise, DNS is better suited for the job (UDP versus TCP+TLS negotiations, downstream response caching, etc.) Best practises for DNS based service discovery is detailed in RFC 6763, and they weren’t followed for DEP-0005. DEP-0005 still has “draft” status so the preferred solution is to fix the spec before there are too many implementations and deployments to deal with, right?

pfrazee commented 6 years ago

Well both of the solutions in DEP-0005 are stopgaps prior to a preferred solution of having a new DNS record type. I'm not particularly inclined to change our solution for this edgecase, especially because I think what you're proposing is not as straight-forward to end-users.

da2x commented 6 years ago

Why would there be a new DNS record type for this? This type of service-discovery is exactly what the SRV record type is for. Introducing a new record type doesn’t solve the original problem here, though. Web sites on HTTP are often delivered through content distribution networks (CDNs), and those use CNAMEs. You can’t add a new record type to a CNAME’d domain any more than you can add a TXT record to one.

da2x commented 6 years ago

10 million domains are served by Cloudflare alone. That is 10 million domains with CNAMEs on their primary user facing domain that can’t use DEP-0005. These are just the customers of a single service provider.

There are really good reasons behind the decisions in RFC 6963. Please don’t discard them so easily.

pfrazee commented 6 years ago

This type of service-discovery is exactly what the SRV record type is for.

We may have missed that. We should look into it.

There are really good reasons behind the decisions in RFC 6963. Please don’t discard them so easily.

Please don't be upset! You're asking for a large change and I just don't want to leap into it unless I'm sure we must.

10 million domains are served by Cloudflare alone. That is 10 million domains with CNAMEs on their primary user facing domain that can’t use DEP-0005. These are just the customers of a single service provider.

The thing is that while this DEP is a draft, it is deployed so switching has a cost. If .well-known/dat is suitable for the CNAME use-cases, we should consider whether the switching cost is worth it. If the well-known won't solve the situation for CNAME users, then we won't have a choice.

da2x commented 6 years ago

The thing is that while this DEP is a draft, it is deployed so switching has a cost.

It’s also in draft to allow for feedback and changes when issues are identified. Draft specifications normally have huge warnings near the top warning people that they’re not stable, and that implementations are encouraged and that they should provide feedback and be prepared for change.

You're asking for a large change and I just don't want to leap into it unless I'm sure we must. […] I'm not particularly inclined to change our solution for this edgecase

I believe were we disagree here is that this isn’t an edge case.

I queried the entire Alexa Top 1 Million list and found that this issue affects 46,3 % of the top one million domains. That is to say, nearly half of all popular websites rely on CNAMEs to some extent for their www. subdomain.

It will be more confusing to people when the DNS discovery method doesn’t work on half the web. The DNS discovery method also shouldn’t be the preferred resolution method when it’s known not to work on nearly to half the web.

If the well-known won't solve the situation for CNAME users, then we won't have a choice.

It will work, but it does require a web server (something the DNS method doesn’t). It’ll also have reduced performance compared to the DNS method due to the protocol overhead.

Why even have support for the current DNS method when the spec means it’s unworkable for so many? The DNS-SD spec solves these issues (and others) but does require the use of dedicated named DNS records. The IETF doesn’t just make up unnecessarily complicated specs without good reason. They usually aim to make it as simple as possible to deter the introduction of implementation bugs. Some things, like DNS, are quite complicated, though.

Please don't be upset!

This issue just complicated things considerably for my planed IPFS and Dat deployment. I didn’t realize Dat service discovery over DNS would be an issue at all. And I haven’t slept and I may be a bit grumpy. 😛 Sorry about that.

*I took the Alexa Top 1M list, prepended www., and looked up CNAME records for the entire set. You can’t have a CNAME on the bare-domain, so I used the most common subdomain www. for the survey. This only shows how common CNAMEs are in deployments and not whether websites actually use the www. subdomain or just their bare-domain. Excluding domains where the www. CNAME point to the bare-domain still leaves 16,8 % (and almost 30 % of the top 100 000). Regardless, the number of websites that can’t use the DNS record method is clearly unacceptably high. You can get a copy of the raw survey data here [text/csv].

da2x commented 6 years ago

By the by, I also ran through the Alexa Top 1 Million (bare and www. subdomain) and only found three domains with datkey records:

alexa_rank, domain, datkey_txt_record
440890, normadesign.it, "datkey=ceb9705ffedd458724e20c0059226a7a897b75442ef583fc51d4529c94222ef9"
440890, www.normadesign.it, "datkey=ceb9705ffedd458724e20c0059226a7a897b75442ef583fc51d4529c94222ef9"
501665, beakerbrowser.com, "DATKEY=87ed2e3b160f261a032af03921a3bd09227d0a4cde73466c17114816cae43336"

pfrazee commented 6 years ago

We discussed this in the Dat WG. We want to explore .well-known/dat as the solution for people with the CNAME issue. Other than the performance cost and the HTTPS server requirement, are there any other reasons not to use that solution?

da2x commented 6 years ago

Other than the performance cost and the HTTPS server requirement, are there any other reasons not to use that solution?

Isn’t that enough? The main problem is that the draft spec is broken by design. There is time to fix it before too many clients and websites implement it.

The thing is that while this DEP is a draft, it is deployed so switching has a cost.

At the moment the cost of fixing the draft spec is really low. I only found one domain in addition to Beaker’s own domain. Anyone who implements a draft spec should know to expect that things may change before it’s finalized.

I see three possible paths:

Fix the spec and have something that works for everyone. This increases the complexity of configuring DNS a little.
Deprecate the DNS TXT record entirely and have everyone use well-known method. This reduces performance considerably, adds setup and maintenance complexity, and requires a traditional web server. There will be one method that works for everyone.
Leave things as they are. The preferred and fast method won’t work for many websites and clients will be doing pointless DNS requests that are known not to work for many websites.

Fixing the draft spec is clearly the most preferable option.

pfrazee commented 6 years ago

The main problem is that the draft spec is broken by design.

Please respect the positions of other people during a disagreement. We're discussing tradeoffs and it's clear that there's no consensus between us that the spec is broken. To assert otherwise is rude.

The Dat Working Group discussed this issue at the last meeting and members independently agreed that we'd prefer not to make the changes you've suggested unless we have to. If you want to convince us that a change is necessary, you need to advance a compelling argument against using the well-known solution in the case when a CNAME is blocking the DNS solution.

dnebdal commented 6 years ago

Hi, and sorry for jumping into the middle of this. I've read through this a couple of times now, and ... I don't get it.

You have a spec proposal that has made a small mistake that blocks out almost half the domains out there. The suggested solution is trivial, brings you in line with a relevant, existing standard, and should make it easier to integrate with things like mDNS. The only things broken by changing it now are the domains of a single-digit number of enthusiastic early adopters who know they're using a non-final spec.

Why wouldn't you fix this now? You're clearly hoping for large-scale adoption, and this is the sort of design flaw that'll annoy implementers for decades if you leave it alone. Having a completely separate fallback mechanism is great, and I have no arguments against the .well-known solution - but that's not an argument against fixing this now.

pfrazee commented 6 years ago

@dnebdal My resistance to the spec change isn't just the switching cost. I'm not super excited about the aesthetics and UX of prepending special subdomains. Given that the .well-known solution is viable, I feel it's feasible to include aesthetics and UX in the consideration. I won't speak for other WG members, but I will say that the same view was voiced by others during the meeting.

I'm not against considering this change, I'm just not excited about it, which is why I'm asking for a more convincing case against using .well-known as the solution.

mafintosh commented 6 years ago

I can see both sides of this.

On one hand I like the aesthetic and "first-class citizen"ness of the current DNS record. On the other hand I understand the complexities this can create described above.

For me, personally, setting up DNS records are somewhat easier that dealing with the .well-known stuff most of the time, since most DNS providers have simple interfaces you can use to configure this stuff vs having to write a file on a HTTP server.

pfrazee commented 6 years ago

For me, personally, setting up DNS records are somewhat easier that dealing with the .well-known stuff most of the time, since most DNS providers have simple interfaces you can use to configure this stuff vs having to write a file on a HTTP server.

That's a fair point. I would assume if you're using a CDN then you're probably putting that in front of a server which you could write the file to, but that may not always be the case.

da2x commented 6 years ago

I'm not against considering this change, I'm just not excited about it, which is why I'm asking for a more convincing case against using .well-known as the solution.

It takes one network roundtrip over DNS. The DNS server can generally be assumed to be network topographically close to the user as most get DNS from their ISP or possibly a provider who’ve setup shop at their local NIX. It’s as fast as it gets.

It takes at the very least four roundtrips over HTTPS (this is just the protocol level and not even including the TCP overhead). This is assuming every single best practices is followed. The real number of back-and-forths for most servers is probably six plus the TCP overhead. The server can be anywhere on or off world.

DNS can also be more reliable. It’s much cheaper and easier for publishers to to setup multiple name servers to get redundancy for their DNS availability than it is to do the same for HTTPS.

DNS-over-TLS or DNS-over-HTTPS doesn’t really change things here as the local resolver can be assumed to have an open connection to the DNS provider. (Cold starts are a possibility, but now we’re talking edgecases.)

I'm not super excited about the aesthetics and UX of prepending special subdomains.

No one sees these DNS records, though. A publisher sees them once while they set it up for their domain and that is it. It’s much more important to have a reliable and fast discovery method that works for everyone.

You can improve on UX by creation online test and debugging tools so people can check that they’ve set it up correctly. (Like what Let’s Debug does for the otherwise complicated and potentially error-prune process of obtaining certificates for Let’s Encrypt.)

You can get a prettier compromise by following IPFS and dropping the protocol subdomain (_udp.) entirely. This technically works just fine, though it makes it harder to delegate all service discovery requests to a dedicated authoritative server. This part of RFC 6763 is mostly important at scale and for mDNS in larger organizations. Here are some single-subdomain suggestions that will work just fine:

_dweb.www.example.com. IN TXT 3600 "datkey={key}"
_dat-protocol.www.example.com. IN SRV 3600 "datkey={key}"

The underscore prefix is used by convention for domains were you don’t expect to find A/AAAA records; such as the special domains used for service-discovery. It’s not a requirement of RFC 6763. It often group these types of records together with other service-discovery records in many DNS admin tools, though (those that sort by zone name, at least).

I would assume if you're using a CDN then you're probably putting that in front of a server which you could write the file to, but that may not always be the case.

That’s the case for a pull-CDNs (basically works like your everyday reverse proxy) but not push-CDNs (acts more like a FTP server where you deposit files, and is usually way cheaper). The latter is mostly used for larger files like videos and such which really could benefit from some peer-to-peer love.

pfrazee commented 6 years ago

@da2x okay appreciate that write-up. I've been talking about this with @mafintosh and we're going to give it another look.

No one sees these DNS records, though. A publisher sees them once while they set it up for their domain and that is it.

Not to litigate further, but to clarify the nature of my position: A big part of what we're trying to accomplish is turning more non-technical people into publishers, so I consider the UX of setting up a domain name to be relatively important. You may be right that the performance matters more.

da2x commented 6 years ago

Here are some more reasons for fixing the DNS discovery method:

An ISP/country-level firewall can block the HTTPS server IP and stop .well-known discovery from working. Users can use one of a million different DNS server to use the DNS discovery method. It’s way easier to route around censorship with DNS because there are so many [free] providers.
Websites go offline for all sorts of reasons. Earth quake, comets, legal, billing-email-got-lost, bus factor, server park switch explosions, DDoS, etc. A domain can be setup to replicate their DNS zone on a dozen different DNS hosting services and have spectacular availability of their DNS. There are tons of low-cost or even free providers that will keep hosting your DNS for free (provided the domain registration itself doesn’t expire.)

da2x commented 5 years ago

I’ve just completed a scan of the top 2,4 million websites and domains on the web. I only found ten Dat websites, all of which used the Well Known URI mechanism. Only beakerbrowser.com has a DATKEY record. In other words, it’s very doable to change the specification without causing any breakage.

jedahan commented 5 years ago

Making the lookup compatible with DNS-SD would help the use case of routerless node discovery.

In particular, I have been building a network that uses DNS-SD over IPv6 multicast to discover nodes, which is quite nice.

RangerMauve commented 5 years ago

Now that we're about to have a big breaking change in the protocol, this is back on the table.

DougAnderson444 commented 5 years ago

Cool! Where can we read up on the breakthrough change? (Is it the multiwriter stuff?)

RangerMauve commented 5 years ago

There's no comprehensive post with everything in it yet, but you can get started here: https://twitter.com/mafintosh/status/1177259694441861120

dat-ecosystem-archive / DEPs

DEP-0005: Move DNS TXT record to a dedicated subdomain #45