NLnetLabs / draft-toorop-dnsop-dns-catalog-zones

Work on catalog zones

Version 04 #33

Closed wtoorop closed 2 years ago

wtoorop commented 2 years ago

Just accumulated changes for version 4

wtoorop commented 2 years ago
1. Terminology

(two levels below $CATZ) it does not make sense to mention here (this way), more misleading than helping without the context of chapter 4.3

I agree.

Done in a62baaab

1. Description

A list of DNS zones misused bullet list with just one bullet. Rephrase into normal sentence?

+1

Fixed in d8d5e603

Authoritative servers may be preconfigured with multiple catalog zones. I would just keep this beginning of the paragraph and delete the rest, since different member zone configuration can also be achieved with groups, and this issue will be elaborated upon later.

I suggest going even further and removing that sentence too. If we don't define those multiple zones here, then implementers will not know what to do.

Ack! In: 1eae6858

It is not expected that the content of catalog zones will be accessible from any recursive nameserver. How about promoting this to SHOULD NOT or something stronger? Anyway, this is declared again (and better) in section 6.1.

I think MAY would be more appropriate. I do not immediately see a reason why there should be a strong recommendation to disallow querying through a resolver, in general. It entirely depends on the use case on the producer side.

Thanks @peterthomassen , done in 79aa86a2

5.1. The Change of ownership (Coo) Property A Change Of Ownership is signaled... this paragraph has broken formatting.

Fixed in 292595bf

coo.<unique-N>.zones.$OLDCATZ 0 IN PTR zones.$NEWCATZ just a suggestion, discussion welcome. There are three possibilities of how this could be arranged:

  a) ... PTR $NEWCATZ (equivalent, but shorter)
  b) ... PTR zones.$NEWCATZ (as is now)
  c) ... PTR <new-unique-N>.zones.$NEWCATZ (requires knowledge of new unique-N)

Which one is the best?

a

Done in 1496f08c
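
For illustration, the chosen option (a) could be rendered like this. It is only a minimal Python sketch: the helper name `coo_record` and the example zone names are hypothetical and do not come from the draft.

```python
def coo_record(unique_n: str, old_catz: str, new_catz: str) -> str:
    """Render a coo property in zone-file presentation format.

    Follows option (a) from the discussion: the PTR target is the
    apex of the new catalog zone, not zones.$NEWCATZ.
    All names are expected to be fully qualified (trailing dot).
    """
    owner = f"coo.{unique_n}.zones.{old_catz}"
    return f"{owner} 0 IN PTR {new_catz}"


# Hypothetical example names, for illustration only.
print(coo_record("k4pdyy", "catalog1.example.", "catalog2.example."))
```

With option (b) the target would instead be `zones.` followed by the new catalog zone name. Option (c) would additionally need the new member node label, which the old owner may not know.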

All associated state for the zone (...) is in such case reset, unless the epoch property (...) is... is this really useful? I would expect that the COO transition will be reset-less by default.

I assumed a COO would be a (seamless) reset. Epoch seems like a reasonable tiebreaker to me.

I also assumed reset (because you don't want the new owners to inherit DNSSEC keys and such). So I leave it unchanged for now. Speak up if you think this is wrong.

If we turn it around, and COO transition is reset-less, then -after- the transition, epoch is what is used for a reset. It makes sense to me to combine them so the reset can happen at the right time (the COO-transition).

Also @Habbie . This is an example where we still need epoch (you doubted if it was still useful for zone resets).

The new owner is advised to increase the serial of the member zone after the ownership change, so that the old owner can detect that the transition is done. this is IMHO a completely wrong and misleading suggestion. Might it happen that the update (SOA bump) of the member zone gets propagated quickly (while still being configured by the old catalog), while the catalog updates fall behind?

I support completely removing the paragraph.

I have removed it completely in 39ffe544 , but we could also leave a bit of text, like:

The old owner may remove the member zone containing the coo property from $OLDCATZ once it has been established that all its consumers have processed the Change of Ownership.

WDYT?

5.4. The Serial Property Nameservers that are secondary for that member zone (multiple times throughout the chapter): how about using "catalog consumers" according to the terminology (chapter 2) ? This might apply in different places in the document.

+1

I think a catalog consumer is not the same as a secondary, by the way. I have tried to make the usage of catalog consumer and catalog producer consistent throughout the document in commits e358d0ef and 579c57b7

Please review carefully

A refresh timer of a catalog zone MUST NOT be ignored. what does this sentence actually declare? Is it just a reminder for the reader not to misinterpret the previous one? In that case, it probably shouldn't be written in such normative language. How does this apply to the (exotic) case of a catalog zone being a member of another catalog zone?

I don't like the whole paragraph. Consider:

  • my catalog zone has a 1H refresh
  • my member zone has a 5min refresh
  • now my member zone can fall behind by an hour instead of by five minutes if NOTIFY is not correctly working

Okay, I removed the paragraph completely in 23d6fca7 , but I do think optimization is possible (reducing the number of SOA queries). Should we add text that suggests to reset member zone refresh timers after the catalog zone that has that as member with a serial property has refreshed or been updated? WDYT?

6.2. Member zone removal Only when the zone was configured from a specific catalog zone, and the zone is removed as a member from that specific catalog zone, the zone and associated state (such as zone data and DNSSEC keys) MAY be removed. wouldn't a MUST be better here? It seems to be anyway needed for the usefulness of section 6.5. 6.5 is a fallback mechanism, of course, but indeed it does not work without a MUST here.

Agree. Changed to MUST in: e426a538
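
The rule discussed here could be sketched as follows. This is a minimal Python illustration, not text from the draft: the `Consumer` class, its `configured_from` map and the method name are hypothetical, and "removing state" is reduced to dropping a dictionary entry.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Consumer:
    # Hypothetical bookkeeping: member zone name -> catalog zone it was configured from.
    configured_from: Dict[str, str] = field(default_factory=dict)

    def on_member_removed(self, zone: str, catalog: str) -> bool:
        """Handle removal of `zone` as a member of `catalog`.

        The zone and its associated state are removed only when the zone
        was configured from that specific catalog zone. Returns True if
        the zone was removed, False if it was left untouched.
        """
        if self.configured_from.get(zone) != catalog:
            # Configured from another catalog (or by hand): do nothing.
            return False
        # Stand-in for removing zone data, DNSSEC keys and other state.
        del self.configured_from[zone]
        return True
```

Keeping track of the originating catalog per zone is what allows the name clash fallback in section 6.5 to stay safe: a removal from one catalog can never delete a zone that another catalog owns.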

IN GENERAL: the possibility of resetting the member zone by changing its <unique-N> has been removed? Any discussion around this topic?

epoch can do that. However, it looks like we currently do not cover 'what to do if unique-N changes' at all, which is bad.

Added that in a2d69eec That commit also changes "Zone associated state reset" to use a member node name change as the default mechanism for that. epoch is still needed for maintaining state when migrating with coo. (and for state reset when <unique-N> is predictable but that should be in a future document I now think)

1. Security Considerations

A primary nameserver SHOULD NOT serve a catalog zone for transfer without using TSIG and a secondary nameserver SHOULD abandon an update to a catalog zone that was received without using TSIG. I don't like this very much. I agree with a SUGGESTION to use TSIG, but this is too demanding. What about safe networks (where the XFR does not traverse the internet), or other ways of securing the XFR (e.g. TLS)?

I agree the language is too prescriptive.

Changed in accf15cd

1. Acknowledgements

I don't want to be improper, but those thank-yous seem to be a little bit outdated. We should probably keep listing and appreciating the authors of the idea, but they are no longer authors of any substantial part of the document.

If they were authors of substantial parts, we might list them as authors! I think it's very important to acknowledge everybody that contributed in any form.

Agree, so I don't do that.

I also updated the Implementation status to include KnotDNS 3.1 in 59ac1f20

libor-peltan-cznic commented 2 years ago

The old owner may remove the member zone containing the coo property from $OLDCATZ once it has been established that all its consumers have processed the Change of Ownership.

+1

Should we add text that suggests to reset member zone refresh timers after the catalog zone that has that as member with a serial property has refreshed or been updated?

Sorry, I don't understand you. Do you mean not only taking from the Serial property the information that a member zone has been updated, but also the information that the member zone has NOT been updated and is up to date, so that it should not be re-checked until the SOA refresh interval has passed, counted from when the information appeared, i.e. the last transfer of the catalog zone? I like this idea, but I'm not able to write it down in an elegant sentence :)

wtoorop commented 2 years ago

The old owner may remove the member zone containing the coo property from $OLDCATZ once it has been established that all its consumers have processed the Change of Ownership.

+1

Should we add text that suggests to reset member zone refresh timers after the catalog zone that has that as member with a serial property has refreshed or been updated?

Sorry, I don't understand you. Do you mean not only taking from the Serial property the information that a member zone has been updated, but also the information that the member zone has NOT been updated and is up to date, so that it should not be re-checked until the SOA refresh interval has passed, counted from when the information appeared, i.e. the last transfer of the catalog zone? I like this idea, but I'm not able to write it down in an elegant sentence :)

Haha that is exactly what I meant yes! Well, your sentence looks better than what I wrote.. I'll try to come up with something.

libor-peltan-cznic commented 2 years ago

How about extending the existing paragraph:

Catalog consumers which are secondary for that member zone, MAY compare the serial property with the SOA serial since the last time the zone was fetched. When the serial property is larger, the secondary MAY initiate a zone transfer immediately without doing a SOA query first. The SOA query may be omitted, because the SOA serial has been obtained reliably via the catalog zone already.

with additional

When the serial property is equal, the secondary MAY postpone next refresh by SOA refresh value (counted since the transfer of the catalog zone), i.e. the same way as if it had queried the primary SOA directly and found it equal.

libor-peltan-cznic commented 2 years ago

At the end of chapter 5.3 (just before 5.3.1), I think catalog consumer is more suitable than secondary.

libor-peltan-cznic commented 2 years ago

Additional comments.

reminder from Mattermost:

other issues:

peterthomassen commented 2 years ago
  • the Timestamp resource type is somehow foreseen also in $CATZ apex, but there is no explanation on how to use it and what it should mean. Either remove this (in example and in (both for the catalog zone and for member zones) ), or describe

IIRC, the TIMESTAMP record at epoch.$CATZ was intended as a hash salt in case hashing is used to generate predictable member node labels; it could be incremented in case of collision. I don't remember another use case, so I think it is now obsolete.

Habbie commented 2 years ago

This:

When the serial property is equal, the secondary MAY postpone next refresh by SOA refresh value (counted since the transfer of the catalog zone), i.e. the same way as if it had queried the primary SOA directly and found it equal.

only works with TIMESTAMP I think? And I'm not sure this is enough reason to have TIMESTAMP at all.

libor-peltan-cznic commented 2 years ago

No, this is not related to timestamp.

Normally, the secondary queries the primary's SOA, and if up-to-date, it postpones the next check by the SOA refresh value.

This way, the secondary downloads the catalog zone, observes the equal member zone's serial, and postpones the next check of the member zone by its SOA refresh value.

Habbie commented 2 years ago

"The old owner may remove the member zone containing the coo property from $OLDCATZ once it has been established that all its consumers have processed the Change of Ownership."

+1

Habbie commented 2 years ago

I did not write that carefully. I meant to say that the following only works if the catalog zone itself is timestamped:

This way, the secondary downloads the catalog zone, observes the equal member zone's serial, and postpones the next check of the member zone by its SOA refresh value.

But that assumes that the catalog zone was fresh at fetch time. It could be hours old.

Habbie commented 2 years ago

I also assumed reset (because you don't want the new owners to inherit DNSSEC keys and such). So I leave it unchanged for now. Speak up if you think this is wrong.

Using epoch to avoid a reset gives the receiving zone control of not resetting DNSSEC keys. This seems like an attack vector to me. Not resetting should be a mutual decision between the two parties. I'm also still entirely fine with -always- resetting on coo.

wtoorop commented 2 years ago

The old owner may remove the member zone containing the coo property from $OLDCATZ once it has been established that all its consumers have processed the Change of Ownership.

+1

Done in 5615ccc

How about extending the existing paragraph:

Catalog consumers which are secondary for that member zone, MAY compare the serial property with the SOA serial since the last time the zone was fetched. When the serial property is larger, the secondary MAY initiate a zone transfer immediately without doing a SOA query first. The SOA query may be omitted, because the SOA serial has been obtained reliably via the catalog zone already.

with additional

When the serial property is equal, the secondary MAY postpone next refresh by SOA refresh value (counted since the transfer of the catalog zone), i.e. the same way as if it had queried the primary SOA directly and found it equal.

I did not write that carefully. I meant to say that the following only works if the catalog zone itself is timestamped:

This way, the secondary downloads the catalog zone, observes the equal member zone's serial, and postpones the next check of the member zone by its SOA refresh value.

But that assumes that the catalog zone was fresh at fetch time. It could be hours old.

@Habbie if the serial property is provided by a catalog zone producer, the assumption is that the catalog zone value is kept up to date with the served serial number in the served zone. If that is not the case, then the catalog zone producer should not equip the member with a serial property in the first place.

Anyway, I do think this additional mechanism is in the spirit of the serial property. It would be half-hearted to leave it out. Though, I have weakened the description a bit by turning it into a configurable mechanism, and added a note about the necessity to keep the serial property in sync with the actual SOA serial number when using this mechanism, like this:

Secondary nameservers MAY be configured to postpone next refresh by the SOA
refresh value of the member zone (counted since the transfer of the catalog
zone) when the value of the `serial` property was found to be equal to the
served zone, the same way as if it had queried the primary SOA directly and
found it equal.  Note that for this mechanism it is essential that the catalog
producer is keeping the `serial` property up to date with the SOA serial value
of the member zone at all times. The catalog may not be lagging behind.
Increased robustness in having the latest version of a zone may be a reason to
**not** configure a secondary nameserver with this mechanism.

in commit 829f2ecb
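
To make the two cases concrete, here is a rough Python sketch of the decision a secondary could take after transferring the catalog zone. The function names, the `postpone_on_equal` configuration flag and the returned action strings are hypothetical. Using RFC 1982 serial number arithmetic for "larger" is an assumption of this sketch, not something the quoted text states.

```python
def serial_gt(a: int, b: int) -> bool:
    """True if 32-bit serial a is greater than b in RFC 1982 arithmetic.

    Assumption of this sketch: 'larger' means serial number arithmetic,
    so a serial that wrapped around still counts as newer.
    """
    return a != b and ((a - b) % 2**32) < 2**31


def next_action(serial_property: int, local_soa_serial: int,
                postpone_on_equal: bool) -> str:
    """Decide what to do with a member zone after a catalog zone transfer."""
    if serial_gt(serial_property, local_soa_serial):
        # The catalog already told us the primary has a newer version,
        # so the SOA query can be skipped.
        return "transfer"
    if postpone_on_equal and serial_property == local_soa_serial:
        # Treat it as a successful SOA check: restart the member zone's
        # refresh timer, counted from the catalog zone transfer.
        return "postpone-refresh"
    # Mechanism not configured, or serial property is behind: keep the
    # regular refresh schedule.
    return "normal-refresh"
```

The `postpone_on_equal` flag mirrors the "MAY be configured" wording: an operator who cannot rely on the producer keeping the serial property in sync would leave it off.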

At the end of chapter 5.3 (just before 5.3.1), I think catalog consumer is more suitable than secondary.

Not doing that in favor of removing the epoch property entirely, which is done in commit: 3f60ac5b

Additional comments.

reminder from Mattermost:

  • it would be useful to also mention that some other extensions (without private-extension label) may be added by subsequent RFCs (Sect 5.5)

Done in 17326c76

  • we can add for example into Chapter 1: Other use-cases of nameserver remote configuration by catalog zones are possible, where the catalog consumer might not be a secondary.

Done in 9148284f

  • extend Catalog consumers MUST ignore PTR RRsets with more than a single record. in sect 4.3 with that the limitation only applies to 'member zone name'

Done in 8c672baa
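
As a rough illustration of that rule, a consumer could collect member zones as in the Python sketch below. The function name `member_zones` and the input shape (owner name mapped to a list of PTR targets) are hypothetical. The sketch assumes names are already lowercased and fully qualified.

```python
from typing import Dict, List


def member_zones(ptr_rrsets: Dict[str, List[str]], catz: str) -> Dict[str, str]:
    """Map member node owner names to member zone names.

    Only PTR RRsets at <unique-N>.zones.$CATZ are treated as member
    nodes, and those with more than a single record are ignored.
    """
    suffix = f".zones.{catz}"
    members: Dict[str, str] = {}
    for owner, targets in ptr_rrsets.items():
        label = owner[: -len(suffix)] if owner.endswith(suffix) else ""
        if not label or "." in label:
            # Not a member node: properties such as coo live deeper in
            # the tree and are not covered by this particular rule.
            continue
        if len(targets) != 1:
            # More than one PTR record: ignore this RRset.
            continue
        members[owner] = targets[0]
    return members
```
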

  • fix the weird sentence A catalog zone can be updated via DNS UPDATE on a reference primary nameserver, or via zone transfers. (discussion pending)

Doesn't make sense to me either. I just removed it in f1b94530

other issues:

  • the Timestamp resource type is somehow foreseen also in $CATZ apex, but there is no explanation on how to use it and what it should mean. Either remove this (in example and in (both for the catalog zone and for member zones) ), or describe

It is gone together with epoch in commit 3f60ac5b

  • the "reset" of member zone is mentioned in several places, in each case very (and differently) vaguely described. Maybe better not describe at all and always refer (link) to respective section

It is in just 2 places, but since 008b1811 all of them now reference subsection "Zone associated state reset" which is similarly vaguely described ;)

  • a nitpick: in the example for Serial property, all serials use Datetime policy. How about illustrating different serials (e.g. 3, 1634730530, 2021102002) ?

ok, in: 48f99093

  • I would swap section 6.2 and 6.3, so that it's more clear to the reader why the removal is so sensitive.

Done in 2272578e

  • Section 6.5: the link to the section with name clash does not work and #nameclash is displayed instead of the link

Done in e8866ede

From @Habbie on Mattermost:

I had a good chat with @mind04 about the current state of the draft. After that, I propose the following things: always reset on COO

This is how it is defined now since 3f60ac5b . Commit 008b1811 changed that again to make it shorter (referencing the section on zone reset instead of redescribing it again).

unique-N is the only reset mechanism otherwise

This is how it is defined now since the removal of the epoch property in: 3f60ac5b

SERIAL is going to be the big point of discussion, we don't even see how to sanely implement it - perhaps postpone it to a second document?

It is indeed describing something quite orthogonal to catalog zone maintenance (i.e. an enhancement building on catalog zones), but it is also defined here as something optional, so why not.

get rid of TIMESTAMP for all purposes

Yes it is removed with the complete removal of the epoch property in: 3f60ac5b

rename private-extension to the much shorter ext

Done in 1a62e81a

(also, I still don't care much for group, and still don't think it'll survive a Last Call anyway, for lack of semantics)

It does fit well with NSD's patterns and with KnotDNS's templates. The property is also optional to implement. For a manually edited catalog zone, it is less cumbersome to manage than the coo property.

From Mattermost: @Habbie

ok, another proposal: we go back to "unless unique-N is the same, MUST reset all zone and related data"; then if users are worried the other party is untrustworthy (and will imitate their unique-N on purpose), they can do an additional reset before or after the transfer. Maybe we can put a few words about that in Security Considerations. (this is not my favourite version, but it's one I can accept. It leaves open the question of "what does 'not resetting' mean?" - which means that not doing resets will involve talking to your operators anyway)

Described in 2244f46e