StackExchange / dnscontrol

Infrastructure as code for DNS!
https://dnscontrol.org/
MIT License

Are IMPORT_TRANSFORM cnames wrong #32

Closed captncraig closed 7 years ago

captncraig commented 7 years ago

Currently if I do:

D("old.com",...
  CNAME("c", "google.com.")
)
D("new.com",...,IMPORT_TRANSFORM("old.com"))

I will end up with:

CNAME c.old.com.new.com. -> google.com.new.com.

But google.com.new.com. is not defined anywhere else, as an A record or otherwise, because the original target is outside the zone.
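
To make the behaviour concrete, here is a minimal sketch of the "always append" rule as described in this issue. It is a model for discussion only, not dnscontrol's actual code; `appendDomain` and `transformCname` are hypothetical names.

```javascript
// Hypothetical model of the "always append" rule described in this issue.
// Not dnscontrol's implementation.

// Append newZone to a fully qualified name,
// e.g. "google.com." + "new.com" -> "google.com.new.com."
function appendDomain(fqdn, newZone) {
  return fqdn + newZone + ".";
}

// Transform one CNAME from the old zone into the new zone.
// label:  record label in the old zone, e.g. "c"
// target: fully qualified CNAME target, e.g. "google.com."
function transformCname(label, target, oldZone, newZone) {
  const name = appendDomain(label + "." + oldZone + ".", newZone);
  // The target is rewritten unconditionally, even if it is outside oldZone.
  return { name: name, target: appendDomain(target, newZone) };
}

const rec = transformCname("c", "google.com.", "old.com", "new.com");
console.log("CNAME " + rec.name + " -> " + rec.target);
// prints: CNAME c.old.com.new.com. -> google.com.new.com.
```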

I'm wondering whether the behaviour for CNAMEs should be to first check if the target is an FQDN outside the old zone and, if so, leave it untouched.

That would give us instead:

CNAME c.old.com.new.com. -> google.com.
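
A rough sketch of what that proposed check might look like. This models the idea only and is not dnscontrol code; `isInZone` and `transformCnameProposed` are hypothetical names, and the second record (`d`) is invented for illustration. Names are assumed to be lowercase and fully qualified.

```javascript
// Hypothetical model of the proposed rule; not dnscontrol's implementation.

// True if fqdn is oldZone itself or a name below it.
function isInZone(fqdn, oldZone) {
  const zone = oldZone + ".";
  return fqdn === zone || fqdn.endsWith("." + zone);
}

// Rewrite the record name as before, but only rewrite the target
// when it points inside the old zone.
function transformCnameProposed(label, target, oldZone, newZone) {
  const name = label + "." + oldZone + "." + newZone + ".";
  const newTarget = isInZone(target, oldZone)
    ? target + newZone + "."
    : target; // outside the old zone: leave untouched
  return { name: name, target: newTarget };
}

const external = transformCnameProposed("c", "google.com.", "old.com", "new.com");
console.log("CNAME " + external.name + " -> " + external.target);
// prints: CNAME c.old.com.new.com. -> google.com.

const internal = transformCnameProposed("d", "www.old.com.", "old.com", "new.com");
console.log("CNAME " + internal.name + " -> " + internal.target);
// prints: CNAME d.old.com.new.com. -> www.old.com.new.com.
```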

tlimoncelli commented 7 years ago

The goal of IMPORT_TRANSFORM is to create a subdomain that is used for API calls. That is, calls that would normally go to X.old.com now need to go to X.old.com.new.com.

At this time, no code tries to connect to c.old.com. (or c.old.com.new.com.), so it doesn't matter that the CNAME is broken. It will never be used.

The cost of unused DNS records is a few kilobytes on the DNS servers. That is a lot less expensive than creating/debugging/maintaining code that doesn't affect production. Not generating those records would have cosmetic value, but doesn't affect production.

A little bit of SRE philosophy...

A big source of human error is mental-model mismatch. When configuring or operating a system, a person holds a mental model of what is going on inside it. They are, essentially, emulating the software in their head to predict that the change they are making will have the desired result. The more complex the system, the less likely the mental model will match it. A mismatch leads to confusion and frustration and, more importantly, increases the risk of an operating error that creates production problems.

If the rules are simple ("the transformation always appends the new.com. domain"), the mental model will be more accurate than if they are complex ("the transformation appends new.com. but only if the target ends in old.com., plus serverfault.com and other domains in a whitelist; unless the record is annotated with the INCLUDE_IN_TRANSFORM keyword, etc. etc."). What if the rules include a little-known exception to handle a corner case? It may seem like a good corner case to fix, but not if it adds "mystery" to the system.

One could argue that unused DNS records add mystery to the system. Someone looking at the DNS zone would see "google.com.new.com." and think it is a bug. However, it is easier to explain "unused records may not be valid" than to explain a set of complex rules that eliminate such records. We can also point them to this bug ID for more info.

One more thing...

"Future proofing is not adding stuff. Future proofing is making sure you can easily add code/features without breaking existing functionality." Imagine we do fix this cosmetic issue. Now imagine that at a later date there is a production reason why we need to change CNAME processing. We would have to fix the code and be careful not to break backwards compatibility for users who have grown to depend on the cosmetic change. In the worst case, the cosmetic change will have grown dependencies and will now be a production requirement. If the needed production change conflicts with the cosmetic change, we would be in a bind. In the best case, the code change is just more complex and requires more testing. On the other hand, if we leave these broken CNAMEs and wait for the day when we have a production-related feature request, implementing that feature will be easier. (Basically this is the rule that "Future proofing is not adding stuff.")

captncraig commented 7 years ago

Sounds fair to me. I'm wondering if import transform will be useful for anybody else ever, since this is a pretty specific use case. So it goes though.

nicollet commented 7 years ago

DNAME was a way of doing this, if I understand correctly. I'm not sure it is implemented everywhere though, even though it is quite old.
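
For context, a DNAME record (RFC 6672) redirects every name below its owner to the same prefix under a target name; the owner name itself is not redirected. The sketch below illustrates only that substitution rule. The function name `dnameSubstitute` and the example record (`old.com.new.com. DNAME old.com.`) are assumptions made for illustration, and whether DNAME would actually cover the IMPORT_TRANSFORM use case is not settled in this thread.

```javascript
// Sketch of DNAME name substitution (RFC 6672); illustrative only.
// Assumes lowercase, fully qualified names with a trailing dot.
// A DNAME at `owner` redirects names *below* owner (not owner itself)
// to the same prefix under `target`.
function dnameSubstitute(qname, owner, target) {
  const suffix = "." + owner;
  if (!qname.endsWith(suffix)) {
    return null; // qname is not below the DNAME owner
  }
  // Keep everything in front of the owner name, including the joining dot.
  const prefix = qname.slice(0, qname.length - owner.length);
  return prefix + target;
}

// Assumed record for this example: old.com.new.com. DNAME old.com.
console.log(dnameSubstitute("c.old.com.new.com.", "old.com.new.com.", "old.com."));
// prints: c.old.com.
console.log(dnameSubstitute("old.com.new.com.", "old.com.new.com.", "old.com."));
// prints: null
```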

On Feb 1, 2017 4:22 PM, "Craig Peterson" notifications@github.com wrote:

Closed #32 https://github.com/StackExchange/dnscontrol/issues/32.

nicollet commented 7 years ago

Btw, sorry to think about it only now, but DNAME could help fix our * CNAME issue for the failover for free.