Open btrepp opened 1 year ago
Long-term I feel we should have system extensions which are critical and always run, and probably a way to override or inject values into resolv.conf, but many pieces are missing at the moment.
For the registry endpoint, you can use registry mirror config to resolve it to a Tailscale IP, as these are assigned in a static way.
@btrepp Maybe you can clear up my confusion. I appear to be able to use Split DNS with the extension. However, I'm running Talos in a VM on a host machine that is itself part of the tailnet. Could this be the reason Split DNS works: DNS queries are forwarded outside of the VM to the host's DNS, which is configured with Split DNS?
Search Domains is the feature that fails, presumably because it requires edits to /etc/resolv.conf, even if it's running in said VM.
I create CP nodes named cp-0 with the tailscale extension and set the Kubernetes endpoint to be cp.ts. I've got CoreDNS running outside of Talos configured to answer with a CNAME pointing to cp-0.my-tailnet.ts.net when queried for cp.ts. This CoreDNS is configured for .ts using Split DNS. Everything seems to work... Is it going to go horribly wrong at some point, assuming I keep the VM on a host in the tailnet?
It's when I configure Search Domains for ts and use cp as the Kubernetes endpoint that something seems wrong: although everything seems Healthy and the node is Ready, the node can't reach the API server at cp. Perhaps I could even configure libvirt's dnsmasq to include the search domain...
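For readers unfamiliar with why a bare name such as cp depends on the search list, here is a minimal sketch of how a stub resolver typically expands a name into candidate FQDNs. It is an illustration of the common resolv.conf `search`/`ndots` behaviour, not Talos or Tailscale code; the function name `candidates` is hypothetical.

```python
from typing import List


def candidates(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the FQDNs a stub resolver would try, in order.

    Simplified model of resolv.conf `search` and `ndots` handling.
    """
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no expansion.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search]
    as_is = [f"{name}."]
    if name.count(".") < ndots:
        # "Short" names try the search list first, then the name itself.
        return expanded + as_is
    return as_is + expanded


print(candidates("cp", ["ts"]))
print(candidates("cp.ts", ["ts"]))
```

Without a search entry for ts, the only candidate for cp is the bare name itself, which no resolver in this setup answers. That would be consistent with the endpoint failing when the search domain never reaches the node's resolver configuration.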
Yep. I think DNS will basically go up the stack.
For me, it's metal Talos -> router. For you, it would be Talos -> VM host.
As the extension runs in a container, it doesn't change the Talos configs. I did experiment with modifying resolv.conf but ended up having a bad time with it.
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This would definitely still be a great feature!
Now that host-dns exists, maybe this is possible to implement?
It should work in main now with the Tailscale DNS endpoint being the first entry in nameservers and your recursive DNS resolver being the second.
Does that mean that "Allow configuring certain domains to be forwarded to other DNS resolvers" is in main already (and not tied to Tailscale)?
I don't know what you're talking about, sorry. I have no idea about Tailscale; all I said is that split DNS should work in main now.
I do not know about Tailscale either. Since you were the one mentioning it, I wanted to clarify whether this feature was tied to Tailscale. From your answer, I will assume it's not tied to Tailscale. :-)
How do I configure this? The 1.8 docs at https://www.talos.dev/v1.8/talos-guides/network/host-dns/ do not seem to mention how to configure this feature.
There is no feature at all; it will just correctly iterate over the configured nameservers if one returns NXDOMAIN/SERVFAIL.
@smira AFAICT this doesn't happen with NXDOMAIN: https://github.com/siderolabs/talos/blob/7edcbbb833fc56b054ce9ecebc3416f676a51851/internal/pkg/dns/dns.go#L147 (assuming we're talking about https://github.com/siderolabs/talos/pull/9179).
Is there anything standing in the way of just switching to CoreDNS for node DNS as a separate service?
It's not possible to work around this either, because the order of resolvers doesn't appear to be totally under the user's control:
My router DNS seems to always show up first in the list, probably because it comes from DHCP before the machine config is applied.
I believe a DNS server shouldn't return NXDOMAIN if it doesn't know about the domain, so the DNS server is wrong (if I'm wrong, it's easy to fix).
The DNS servers on initial boot before machine config is applied can be controlled via kernel cmdline, but the machine config overwrites any DNS servers configured by other means.
"I believe a DNS server shouldn't return NXDOMAIN if it doesn't know about the domain, so the DNS server is wrong (if I'm wrong, it's easy to fix)."
I do agree, just wanted to make it clear it doesn't work with NXDOMAIN, only SERVFAIL.
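To make the NXDOMAIN versus SERVFAIL distinction concrete, here is a toy model of the fallthrough behaviour described in this thread. It is not the Talos implementation: the `resolve` function and both resolver functions are hypothetical stand-ins, and the only rule modelled is that SERVFAIL moves on to the next upstream while any other response is final.

```python
from typing import Callable, List, Tuple

# (rcode, answers): a toy stand-in for a DNS response.
Response = Tuple[str, List[str]]
Upstream = Callable[[str], Response]


def resolve(name: str, upstreams: List[Upstream]) -> Response:
    """Try upstreams in order; only SERVFAIL moves on to the next one."""
    last: Response = ("SERVFAIL", [])
    for upstream in upstreams:
        last = upstream(name)
        if last[0] != "SERVFAIL":
            # NOERROR and NXDOMAIN are both treated as final answers.
            return last
    return last


def public_resolver(name: str) -> Response:
    # Hypothetical public resolver: the zone exists, the host record does not.
    return ("NXDOMAIN", [])


def tailnet_resolver(name: str) -> Response:
    # Hypothetical tailnet resolver that knows the overlay address.
    if name == "my-machine.my-network.ts.net":
        return ("NOERROR", ["100.90.80.70"])
    return ("SERVFAIL", [])


name = "my-machine.my-network.ts.net"
print(resolve(name, [public_resolver, tailnet_resolver]))
print(resolve(name, [tailnet_resolver, public_resolver]))
```

Under this model, the order of the nameservers decides the outcome: when the public resolver is asked first, its NXDOMAIN ends the lookup before the tailnet resolver is ever consulted.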
I think the issue is that Tailscale uses <machine-name>.<network-name>.ts.net as FQDNs but only returns records on its network-internal resolver. Since ts.net is a real domain, Cloudflare, for example, will return NXDOMAIN. But the network-internal resolver returns the machine IP on the Tailscale overlay network.
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1
;; AUTHORITY SECTION:
ts.net. 300 IN SOA ns1.dnsimple.com. admin.dnsimple.com.
;; Query time: 20 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2
;; ANSWER SECTION:
my-machine.my-network.ts.net. 600 IN A 100.90.80.70
;; Query time: 0 msec
;; SERVER: 100.100.100.100#53(100.100.100.100) (UDP)
"The DNS servers on initial boot before machine config is applied can be controlled via kernel cmdline, but the machine config overwrites any DNS servers configured by other means."
It doesn't, from my testing.
EDIT: removed irrelevant code refs
What I see:
❯ talosctl get resolverspec -o yaml
metadata:
    namespace: network
    type: ResolverSpecs.net.talos.dev
    id: resolvers
spec:
    dnsServers:
        - fd7a:115c:a1e0::53
        - 192.168.0.1
    layer: configuration
$ dig @fd7a:115c:a1e0::53 my-machine.my-network.ts.net
my-machine.my-network.ts.net. 600 IN A 100.90.80.70
$ dig @169.254.116.108 my-machine.my-network.ts.net
ts.net. 10 IN SOA ns1.dnsimple.com. admin.dnsimple.com.
$ dig @192.168.0.1 my-machine.my-network.ts.net
ts.net. 10 IN SOA ns1.dnsimple.com. admin.dnsimple.com.
It probably makes sense to create issues with a full description for both, as I don't quite understand your case.
Your tailnet resolver should come before the Cloudflare one.
DNS servers should be completely changeable with machine config.
Just a heads up: since #9310, "order of resolvers doesn't appear to be totally under the users control" is no longer true. So this should be fixed now? By that I mean that, with recent PRs, the second workaround from the original issue should probably work.
Feature Request
Allow configuring certain domains to be forwarded to other DNS resolvers.
Description
I've been developing a Tailscale extension to allow Talos nodes to have Tailscale IPs (the long-term goal is to talk to backend services, such as storage, over a Tailscale network).
https://github.com/siderolabs/extensions/pull/154
One of the issues is that it would be great to use Tailscale's MagicDNS, so you can write things like 'nas' in your config files and DNS will point you to the correct Tailscale machine.
Tailscale includes this, but it tries to overwrite /etc/resolv.conf. This works great if I bind mount it, but when things go wrong, they go really wrong.
Current workaround
At the moment you can run a DNS server externally and configure it however you wish, but it becomes more external infrastructure you need to maintain. Alternatively, you can use your Tailscale IPs directly, but then you have to make sure the IPs stay aligned (and if Talos wipes a disk, you get a new IP from Tailscale).
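As an illustration of what the request amounts to, here is a minimal sketch of per-domain forwarding: pick the resolvers for a query by longest matching domain suffix, and fall back to the default resolvers otherwise. This is not an existing Talos feature or API. The function `pick_resolvers` and the routing table are hypothetical, and the addresses are only examples.

```python
from typing import Dict, List


def pick_resolvers(
    name: str, routes: Dict[str, List[str]], default: List[str]
) -> List[str]:
    """Return the resolvers for `name`, matching the longest suffix in `routes`."""
    labels = name.rstrip(".").lower().split(".")
    # Try the longest suffix first: a.b.c, then b.c, then c.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default


# Hypothetical routing table: tailnet names go to the Tailscale resolver.
routes = {"ts.net": ["100.100.100.100"]}
default = ["192.168.0.1"]

print(pick_resolvers("my-machine.my-network.ts.net", routes, default))
print(pick_resolvers("example.com", routes, default))
```

Matching on whole labels rather than raw string suffixes avoids sending a name such as hosts.net to the ts.net resolver by accident.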