Open s-hamann opened 2 years ago
Replication is done on a per-domain (per-zone) level, not per record set. The proper object to add this to would therefore be the domain object, and it could indicate whether all secondaries have a current copy. (Or perhaps indicate the last time a current copy was seen everywhere; if that is earlier than the last time the domain was changed, replication is still pending. Have to think about it more.)
In any case, putting this at the domain object will also cover the "RRset deleted" case.
Depending on the replication mechanism, this information is very hard to obtain, and it conflicts with the replication plans we recently introduced (#571). I hence suggest we close this issue as "won't fix".
That being said, desec.io (not desec-stack) could publish information on how long DNS updates (typically) take. I believe currently we have some 99% of updates done in <1min.
For context: We are planning a cooperation with pch.net who will run some nodes for us, and they'll likely do some internal replication. The question will be how we can determine when replication has finished on their side. I'd hope that there would be some way to do that, not only for the purposes of this issue, but generally -- we should have insight into that.
For me as a reader of this bug it is unclear if the requirement is:
In my experience, most of the unforeseen disruptions happen because planning should have considered at least case 3, but effectively "just" modeled with case 1. In my opinion, the values for cases 1 and 2 are of purely academic interest, are just misleading in practice, and shouldn't be offered at all. On the other hand, the values for case 3+ are often underestimated by a large margin (e.g. many records have TTLs in the range of days), and this is only found out "when it's too late". Thus, I think the calculation should actually support predicting the effect of changes before they are committed, and should receive a prominent place in the UX.
For me as a reader of this bug it is unclear if the requirement is:
- to measure the instant when all authoritative nameservers (whether reachable via anycast or unicast) start serving a specific updated record,
This. It's important e.g. for ACME clients to know when it's safe to tell the server that the ACME challenge can be found in the DNS. There might be other reasons why a user might want to know which version of their zone is served in which region.
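For the ACME case, the decision logic can be sketched independently of how the servers are queried. The following is a minimal illustration, assuming we can somehow obtain the TXT strings each authoritative nameserver currently serves (e.g. via `dig @ns TXT _acme-challenge...`); the function and data names are made up for this sketch and are not part of any real API.

```python
# Sketch: decide when it is safe to trigger ACME validation. We assume the
# caller has already queried each authoritative nameserver directly; how that
# is done (dig, dnspython, ...) is left out.

def challenge_visible_everywhere(responses, expected_token):
    """responses: mapping of nameserver name -> set of TXT strings it served.
    Returns True only once every server serves the expected token."""
    return all(expected_token in txts for txts in responses.values())

# Example: ns1 already serves the new challenge, ns2 still serves the old zone.
responses = {
    "ns1.example.net.": {"token-v2"},
    "ns2.example.net.": {"token-v1"},
}
print(challenge_visible_everywhere(responses, "token-v2"))  # False
responses["ns2.example.net."] = {"token-v2"}
print(challenge_visible_everywhere(responses, "token-v2"))  # True
```

Note that this is exactly where anycast bites: one query per nameserver *name* only reflects the nearest anycast instance, which is why the thread argues the check cannot be done reliably from the outside.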
- to compute the worst-case instant that any hypothetical (caching) querier behind a caching resolver is guaranteed to receive the updated record, ...
This has nothing to do with caching resolvers; authoritative DNS ends with putting the zones in place.
values for case 3+ are often underestimated by large (e.g. many records have TTLs in the range of days) and are only found out "when it's too late". Thus, I think, the calculation should actually support the prediction of changes before they're commited and receive a prominent place in the UX.
That may be a good idea, but it's a different issue.
In fact, calculating 3+ is only possible based on the information about what the propagation status on authoritative servers is. So, implementing the feature discussed here is a prerequisite for your feature.
The propagation status is publicly available in DNS. The data required to calculate the delays for a planned change is in DNS before and up until the change is pushed and propagated.
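The usual way to observe replication from the outside is to compare SOA serials across the authoritative servers. A minimal sketch of that comparison follows; how the serials are fetched is deliberately left out, and (as discussed above) with anycast each query only reflects the nearest instance anyway.

```python
# Sketch: a zone is "in sync" when every authoritative server reports the
# same SOA serial as the primary. Strictly, SOA serials are unsigned 32-bit
# values compared with serial-number arithmetic (RFC 1982); plain equality
# suffices for this illustration.

def replication_complete(primary_serial, secondary_serials):
    return all(s == primary_serial for s in secondary_serials)

print(replication_complete(2024010102, [2024010102, 2024010101]))  # False
print(replication_complete(2024010102, [2024010102, 2024010102]))  # True
```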
The SOA record declares the timings until all potentially anycasted servers return coherent answers again. In the case of true multi-master hosting, the SOAs will have different serial numbers, and each zone copy must then be considered in parallel (this is perfectly legal, and Microsoft Active Directory-integrated DNS is a prominent case).
ACME is indeed a very practical use case, and actually the most concrete one; I see it causing confusion very often.
Let's assume for a moment the day we're changing this record is also the day we were unhappy with our previous hoster and decided to migrate our zone to a cool new one. The aforementioned resolver (e.g. a Let's Encrypt verifier) has cached our old NS records and our A record just moments before. Now we're pushing new (delegation and) NS records and a changed A record on our new authoritative servers. The aforementioned resolver is then asked again about our A record, which it realizes has expired. It still holds valid cached NS records though, because our previous DNS hoster and the parent zone had chosen to serve them with a high TTL. Thus, the resolver will start recursive resolution, but using the cached records it will shortcut to querying our previous provider's nameservers, possibly obtaining a technically valid response, but not the one we expected.
$ dig +trace www.desec.io A
. 21098 IN NS i.root-servers.net.
. 21098 IN NS c.root-servers.net.
io. 172800 IN NS b0.nic.io.
io. 172800 IN NS a2.nic.io.
desec.io. 3600 IN NS ns1.desec.io.
desec.io. 3600 IN NS ns2.desec.org.
desec.io. 900 IN A 88.99.64.5
www.desec.io. 3600 IN CNAME desec.io.
So I get that www.desec.io resolves to 88.99.64.5, valid for 15 minutes.
Let's say, I want to migrate desec.io to another DNS hoster and update the A record.
Let's see how quick the zone can be migrated, i.e. how long the delegation records persist in caches:
$ dig desec.io NS @c0.nic.io.
;; AUTHORITY SECTION:
desec.io. 3600 IN NS ns1.desec.io.
;; ADDITIONAL SECTION:
ns1.desec.io. 3600 IN A 45.54.76.1
That's one hour.
Now let's consider for how long desec.io servers think they should be cached:
$ dig desec.io NS @ns1.desec.io.
;; ANSWER SECTION:
desec.io. 300 IN NS ns1.desec.io.
;; ADDITIONAL SECTION:
ns1.desec.io. 900 IN A 45.54.76.1
The informational NS record indicates 5 minutes (while the parent zone requires one hour authoritatively).
So, just from the zone data we would expect the update to propagate in less than 15 minutes. But in fact, as this example shows, a correct answer can't be guaranteed in less than 75 minutes (900 seconds for the A record plus 3600 seconds for the NS delegation).
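The worst-case arithmetic above can be made explicit. The sketch below just reproduces the numbers from the dig output: a resolver may hold the old A record for its full TTL, and independently hold the cached delegation NS records for theirs, so in the worst case the two waits add up rather than overlap.

```python
# Worst-case propagation estimate from the dig output above.
A_TTL = 900                # old A record TTL, seconds
DELEGATION_NS_TTL = 3600   # parent-side NS record TTL, seconds

# Worst case: the resolver refreshes the A record via the stale delegation
# just before the NS records expire, then caches the (old) answer again.
worst_case = A_TTL + DELEGATION_NS_TTL
print(worst_case // 60)  # 75 (minutes)
```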
The SOA record declare the timings until all potentially anycasted servers are returning coherent answers again.
No.
The aforementioned resolver (e.g. a Let's Encrypt verifier) has cached our old NS records and our A record
How do you know that? (I would be surprised if Let's Encrypt's challenge fetching follows standard TTL rules.)
The informational NS record indicates 5 minutes (while the parent zone requires one hour authoritatively).
The parent is not authoritative for the NS records, as indicated by the absence of the AA bit in the response:
$ dig NS desec.io. @a0.nic.io | grep "flags: "
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 3
Only the answer from the child has the AA bit:
$ dig NS desec.io. @ns2.desec.org | grep "flags: "
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 3
For some applications (ACME challenges, TLSA records, ...), it is interesting to know whether a record that was just added/updated/removed/... has fully propagated to the authoritative nameservers. Just querying the authoritative nameservers is not sufficient, due to the anycast network: some frontend servers may already have the data, others may not, but I can only query those that are closest to me.
Particularly when the network is slow to update, it would be useful to have some way to find out if all servers have/publish the same information. Since I do not believe this can be done in DNS, I suggest adding it to the API.
I don't have a clear idea of what this should look like. It could be an additional JSON field that is returned for all RRsets. I'm not sure how to indicate the propagation status of a record removal. Another option might be to provide a list of pending changes.
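To make the "list of pending changes" idea concrete, here is one possible shape such an API response could take. This is purely illustrative; none of these field names exist in the actual deSEC API, and a pending-changes list would also cover the deletion case that a per-RRset field cannot express.

```python
import json

# Hypothetical shape for a "pending changes" API field -- all names invented
# for illustration, not part of the real deSEC API.
pending_changes = [
    {
        "subname": "www",
        "type": "A",
        "action": "update",           # could also be "create" or "delete"
        "submitted": "2023-01-01T12:00:00Z",
        "propagated": False,          # True once all secondaries confirmed
    },
]
print(json.dumps(pending_changes, indent=2))
```

A deleted RRset would simply appear here with `"action": "delete"` until it has vanished everywhere, sidestepping the "how do I attach status to something that no longer exists" problem.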