hackclub / dns

🕹 Manage Hack Club's DNS through a GitHub repository

NS record for nest not being served #1017

Closed reesericci closed 10 months ago

reesericci commented 11 months ago

Recently, a change landed that delegated hackclub.app to our nameserver via an NS record. It appears that the NS record either didn't propagate to DNSimple, or DNSimple isn't serving it correctly.

Evidence:

image

When digging hackclub.app, it should return the IP address, with ns.hackclub.app as the authoritative nameserver. Instead, it returns nothing, with ns1.dnsimple.com as the authority. This shows that the NS record didn't take effect and delegate to our nameserver.

maxwofford commented 11 months ago
Screenshot 2023-11-14 at 12 18 54

Here's what shows up currently on dnsimple's dashboard

maxwofford commented 11 months ago

Interesting... looks like NS is managed on a different area of dnsimple:

Screenshot 2023-11-14 at 12 20 14
maxwofford commented 11 months ago
Screenshot 2023-11-14 at 12 21 16

I've manually removed them from the dashboard for now while testing. Give it some time to propagate and try testing it again. If this works let's find a good way to document this.

polypixeldev commented 11 months ago

Interesting - hasn't propagated yet, but I tried manually asking the DNSimple server with dig, and it seems that it didn't parse the * as a wildcard. The NS record on hackclub.app still points to DNSimple's servers, and there's no NS record on a subdomain of hackclub.app (such as identity.hackclub.app), but when I asked for NS records on *.hackclub.app (shown in screenshot), it answered with the right info.

image
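For anyone who wants to repeat the "ask the DNSimple server directly" step without dig, here is a minimal sketch using only the Python standard library. It hand-builds a query in the RFC 1035 wire format and sends it over UDP to a named server. The function names (`build_query`, `parse_header`, `ask`), the fixed query ID, and the timeout are illustrative assumptions, and the reply parsing only reads the header (AA flag, RCODE, record counts), not the records themselves.

```python
import socket
import struct

QTYPE_A, QTYPE_NS = 1, 2  # record type codes defined in RFC 1035


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if not label:
            continue  # skip empty labels so "" encodes as the root name
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int, query_id: int = 0x1234) -> bytes:
    """Build a one-question query with the recursion-desired bit off."""
    # Header fields: ID, flags (0 = standard query), QDCOUNT=1, AN/NS/AR = 0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    return header + question


def parse_header(response: bytes) -> dict:
    """Extract the AA flag, RCODE and section counts from a reply header."""
    _id, flags, _qd, answers, authority, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": flags & 0x000F,
        "answers": answers,
        "authority_records": authority,
    }


def ask(server: str, name: str, qtype: int, timeout: float = 3.0) -> dict:
    """Send the query to `server` on UDP port 53 and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        response, _ = sock.recvfrom(4096)
    return parse_header(response)
```

Something like `ask("ns1.dnsimple.com", "*.hackclub.app", QTYPE_NS)` versus `ask("ns1.dnsimple.com", "hackclub.app", QTYPE_NS)` would show whether the two names get different answer counts from the same server. Note that a literal `*` label is sent as-is; whether the server treats it as a wildcard is up to the server.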

grymmy commented 11 months ago

I guess perhaps this could just be propagation lag - but, if not, gonna try to sum up the above (please correct me if these assertions are wrong):

  1. the change seems to be showing up as expected on the DNSimple admin UI
  2. at least one of dnsimple's main nameservers is reporting what we expect to see when specifically checking *.hackclub.app
  3. we do not see nslookup hackclub.app ns1.dnsimple.com resolve (I see that on my side)

RE: 3 - can you @reesericci cite documentation/specification as evidence as to why we should expect that specific PR to cause hackclub.app to resolve to something?

More evidence that I've collected:

nslookup via 4.2.2.1:

Screenshot 2023-11-15 at 4 26 49 PM

nslookup via dnsimple:

Screenshot 2023-11-15 at 4 26 59 PM

nslookup via the nameserver I believe you guys own that should be serving the correct (desired) ultimate resolution:

Screenshot 2023-11-15 at 4 26 34 PM
reesericci commented 11 months ago
  • can you @reesericci cite documentation/specification as evidence as to why we should expect that specific PR to cause hackclub.app to resolve to something?

NS records are the standard way of specifying an authoritative DNS server for a domain, see RFC 1035. Our nameserver is supposed to be the authoritative DNS for the domain. Now DNSimple says that NS records are used for delegating subdomains, so maybe it doesn't work on a root/wildcard? See DNSimple docs
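A minimal sketch of the root-versus-subdomain distinction, assuming the server follows the lookup algorithm in RFC 1034 section 4.3.2: NS records found at a name below the zone apex mark a zone cut and produce a referral, while NS records at the apex itself are ordinary zone data and the actual delegation comes from the parent zone. The `lookup` function, the `sub.hackclub.app` entry, and the `203.0.113.10` address are hypothetical illustrations, not DNSimple's implementation, and wildcards are ignored.

```python
APEX = "hackclub.app"

# Toy zone contents keyed by (owner name, record type).
ZONE = {
    ("hackclub.app", "NS"): ["ns.hackclub.app."],      # apex NS: plain zone data
    ("ns.hackclub.app", "A"): ["203.0.113.10"],        # placeholder address
    ("sub.hackclub.app", "NS"): ["ns.hackclub.app."],  # hypothetical subdomain cut
}


def names_below_apex(qname: str, apex: str) -> list:
    """List qname and its ancestors that sit strictly below the apex, top down."""
    if qname == apex or not qname.endswith("." + apex):
        return []
    prefix = qname[: -(len(apex) + 1)].split(".")
    return [".".join(prefix[i:]) + "." + apex for i in range(len(prefix) - 1, -1, -1)]


def lookup(zone: dict, apex: str, qname: str, qtype: str):
    """Return (kind, records) where kind is answer, referral, nodata or nxdomain."""
    # A zone cut anywhere between the apex and qname wins: hand out a referral.
    for name in names_below_apex(qname, apex):
        cut = zone.get((name, "NS"))
        if cut:
            return "referral", cut
    records = zone.get((qname, qtype))
    if records:
        return "answer", records
    # The name exists with other types (NODATA) or does not exist at all.
    exists = any(name == qname for name, _ in zone)
    return ("nodata" if exists else "nxdomain"), []


print(lookup(ZONE, APEX, "hackclub.app", "NS"))         # ('answer', ['ns.hackclub.app.'])
print(lookup(ZONE, APEX, "hackclub.app", "A"))          # ('nodata', [])
print(lookup(ZONE, APEX, "www.sub.hackclub.app", "A"))  # ('referral', ['ns.hackclub.app.'])
```

Under these assumptions, an A query for the apex never turns into a referral to ns.hackclub.app, even though the apex NS record is present and is returned when asked for explicitly. Real providers may behave differently.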

grymmy commented 11 months ago
Does that give you sufficient info to send another PR to fix it? One would think this is a somewhat common use case.

reesericci commented 11 months ago

I honestly don't know why it isn't working.

polypixeldev commented 11 months ago

I think this just isn't configurable using octodns (see this issue I found: https://github.com/octodns/octodns/issues/38). From my understanding, DNS providers just don't support setting apex NS records through the APIs provided, so octodns doesn't really support it. So our only option then is to manually make changes in DNSimple.

grymmy commented 11 months ago

@polypixeldev Thoroughly convinced that you're blocked given the above. Will make the modification via the DNSimple WebUI tomorrow to unblock you.

grymmy commented 11 months ago

Maybe a really stupid question (or extra credit question, perhaps...?): if this isn't supported in octodns and the maintainer is aware, did it complain in a log somewhere to that effect and we missed it?

polypixeldev commented 11 months ago

Not that I see - looking at the log for the deploy action when the root NS record was merged (https://github.com/hackclub/dns/actions/runs/6853213157/job/18633528798), it only shows the previous A records being deleted. So it just skips over any root NS records.

grymmy commented 11 months ago

I have made what I believe is the correct change in DNSimple. Gonna wait a while for it to propagate before I mark this closed/fixed.

Screenshot 2023-11-22 at 2 20 01 PM
polypixeldev commented 10 months ago

I really have no idea why it's not working. dig hackclub.app ns works and answers with ns.hackclub.app, and dig ns.hackclub.app answers with the correct IP, and dig hackclub.app @ns.hackclub.app answers with the correct IP as well, but dig hackclub.app doesn't answer with anything. It seems that for some reason, DNSimple doesn't tell the recursor that the hackclub.app NS record exists unless you explicitly ask it for NS records on hackclub.app.
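The four checks above can be collected into a small script so they are easy to re-run after each change. This is a sketch, assuming `dig` is installed and on the PATH; the `dig_argv` and `run_checks` helpers are hypothetical names, and `+short` is added here only to keep the output compact.

```python
import shutil
import subprocess


def dig_argv(name, rtype="A", server=None):
    """Build the argument list for one dig query, optionally against one server."""
    argv = ["dig", "+short", name, rtype]
    if server:
        argv.append("@" + server)
    return argv


# The same four queries described in the comment, in the same order.
CHECKS = [
    ("NS records at the apex", dig_argv("hackclub.app", "NS")),
    ("A record for the nameserver", dig_argv("ns.hackclub.app")),
    ("A record asked of our nameserver", dig_argv("hackclub.app", server="ns.hackclub.app")),
    ("A record via the default resolver", dig_argv("hackclub.app")),
]


def run_checks():
    """Run every check and print its output, flagging empty answers."""
    if shutil.which("dig") is None:
        raise SystemExit("dig is not installed")
    for label, argv in CHECKS:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=15)
        print(f"{label}: {result.stdout.strip() or '(empty answer)'}")
```

Calling `run_checks()` prints one line per query; the symptom described here would show up as an empty answer on the last line only.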

It doesn't make sense because I tried tracing the query with dig hackclub.app +trace, and whenever the root nameservers or the nameserver for all the .app domains gets a query for hackclub.app A records, it responds with the corresponding NS records for hackclub.app (which point to DNSimple's nameservers).

So I thought it might be a DNSimple issue, but to confirm, I tried messing with DNS on https://messwithdns.net by setting the same records, and the same thing happened - even if I set the NS record on a subdomain, it does absolutely nothing.

Since our authoritative nameserver (CoreDNS) that we have set up on the Nest VM never even gets queried with a normal dig hackclub.app query, I don't think it's an issue with that.
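One way to picture why the Nest nameserver is never contacted is a toy iterative resolver that just follows referrals. This is a sketch under stated assumptions: the referral chain is hard-coded from what the trace in this thread reports (the parent zone points at DNSimple, and DNSimple answers with no data), the `root` and `app-tld` server names are placeholders, and `203.0.113.10` is a documentation address rather than the real IP.

```python
# Each toy server maps a query name to (kind, value): a referral to the next
# server, a final answer, or an empty "nodata" response.
SERVERS = {
    "root": {"hackclub.app": ("referral", "app-tld")},
    "app-tld": {"hackclub.app": ("referral", "ns1.dnsimple.com")},
    "ns1.dnsimple.com": {"hackclub.app": ("nodata", None)},
    "ns.hackclub.app": {"hackclub.app": ("answer", "203.0.113.10")},
}


def resolve(qname, start="root", max_hops=10):
    """Follow referrals from `start` and record every server that was asked."""
    contacted = []
    server = start
    for _ in range(max_hops):
        contacted.append(server)
        kind, value = SERVERS[server][qname]
        if kind != "referral":
            return kind, value, contacted
        server = value  # the referral names the next server to ask
    raise RuntimeError("too many referrals")


kind, value, contacted = resolve("hackclub.app")
print(kind, contacted)  # nodata ['root', 'app-tld', 'ns1.dnsimple.com']
```

In this model the walk stops at the first server that gives a non-referral response, so ns.hackclub.app only gets asked if some server in the chain refers to it.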

The only thing left to try would be to point the domain to our nameserver directly, bypassing DNSimple (assuming the domain isn't bought with DNSimple?).

tl;dr: Maybe it's a DNSimple issue, maybe we don't understand NS records, maybe our use case just doesn't work with DNS - I really have no idea. Any ideas are appreciated!

reesericci commented 10 months ago

I think the best solution here is to do what I originally proposed in #995 and just point it at the registrar level instead of trying to jump through hoops with DNSimple and NS records.

grymmy commented 10 months ago

Here is my determination:

a) I do not feel comfortable delegating control of hackclub.app to the nest nameserver. It violates the declarative/open definition we have for DNS at hackclub; we do not want domain name resolution to go through a black box.

b) Based on my understanding of the use cases nest is trying to support, I question whether this is the proper approach in the first place