home-assistant / plugin-dns

CoreDNS implementation for Home Assistant
Apache License 2.0
19 stars 14 forks source link

Fix local DNS resolution issue #59

Closed fenichelar closed 2 years ago

fenichelar commented 2 years ago
  1. Always try local DNS before Cloudflare
  2. If DNS servers are configured manually and via DHCP, only use manually configured DNS servers not both

Note, this does not remove Cloudflare DNS server fallback.

Zixim commented 2 years ago

Please explain why "fallback should be used" ? What does cloudflare fallback solve, in detail , please? Help us understand the reasoning ?

fenichelar commented 2 years ago

@frenck Can you please elaborate? Fallback is used: https://github.com/home-assistant/plugin-dns/blob/d5a9e507f5933f10a160268b71cace51277f3771/rootfs/usr/share/tempio/corefile#L20 I think there is some confusion on how coredns works.

The solution in this PR does not match that. You already knew that, as that previous PR is yours as well... yet you choose to ignore it?

This is a different issue.

fenichelar commented 2 years ago

This PR has nothing to do with failover. Right now, if DNS servers are both manually configured and provided via DHCP, then both the manually configured and DHCP provided servers are used for load balancing. The manually configured DNS servers should overwrite the DHCP provided servers.

frenck commented 2 years ago

The manually configured DNS servers should overwrite the DHCP provided servers.

I understand that is what you are trying to achieve, yet, we consider the DHCP servers a possible fallback/route as well, which is removed when servers have been manually set up.

So, for the OS-based installs, resolving anything is considered more important than anything else.

alexdelprete commented 2 years ago

Fallback should be used.

No software app should have hard-coded fallbacks that users can't modify. Plus, they're also not documented anywhere. If HA gets network config via DHCP, the only DNS it should use is that one.

Hard-coded DNS is used by chinese manufacturers (or Google/Amazon cloud devices) in their cameras, and most security-aware users block traffic from any LAN client to external DNS servers, rightly so. There's no reason HA needs a fallback server, if the local DNS doesn't work, it should throw an error so the user can check, otherwise we can say HA depends on the internet, when its entire approach is basically "as local as possible".

Does Linux or any other OS have hard-coded fallback servers in its code? I don't think so...

The possibility of disabling fallback can be added using within these scopes:

Pascal said that even if those things are implemented, the system would be considered unsupported, can you confirm? If so, why?

Thank you for the discussion.

fenichelar commented 2 years ago

@frenck

So, for the OS-based installs, resolving anything is considered more important than anything else.

The current implementation doesn't resolve local dns names....

Zixim commented 2 years ago

So, for the OS-based installs, resolving anything is considered more important than anything else.

The current implementation sends local hostnames to Cloudflare.

frenck commented 2 years ago

The current implementation doesn't resolve local dns names...

It does actually; for example, it gets mDNS local domains as known here: https://github.com/home-assistant/plugin-dns/tree/master/plugins/mdns; main reason for this is to work around issues with multicasting and Docker.

As for my personal setup, I used even split DNS with custom local domains in the current method.

Pascal said that even if those things are implemented, the system would be considered unsupported, can you confirm? If so, why?

Because customizations on a managed systems are simply not supported by the core team.

Zixim commented 2 years ago

Why is a local DNS considered a customization ? And why would you want to override it ?

jjvandenberg commented 2 years ago

The current implementation doesn't resolve local dns names...

It does actually; for example, it gets mDNS local domains as known here: https://github.com/home-assistant/plugin-dns/tree/master/plugins/mdns; main reason for this is to work around issues with multicasting and Docker.

As for my personal setup, I used even split DNS with custom local domains in the current method.

according to the still open issue #54, it resolves .local names only for a few moments though :-(

alexdelprete commented 2 years ago

So, for the OS-based installs, resolving anything is considered more important than anything else.

resolving is fundamental, you are right, the problem is relying on hard-coded external dns servers when users have a local dns configured through DHCP or manually. Current implementation is so slow, prone to local resolution errors, and inefficient that it's strange you are not willing to accept changes that make perfectly sense.

If I configure a docker container or an OS via DHCP or manually, I want it to use the DHCP or manual configuration, I don't want it to decide which dns to use, it's not up to the app.

alexdelprete commented 2 years ago

It does actually; for example, it gets mDNS local domains as known here: https://github.com/home-assistant/plugin-dns/tree/master/plugins/mdns; main reason for this is to work around issues with multicasting and Docker.

HA needs a properly configured DNS forwarder, not mDNS, bonjour, zeroconf, and automatic stuff that is SO unreliable...

mDNS, zeroconf, bonjour are all things that can work in a small client environment, and are highly not recommended by people that manage networks, for several reasons, also security:

image

alexdelprete commented 2 years ago

Because customizations on a managed systems are simply not supported by the core team.

Nope. If the PRs mentioned by Pascal are implemented, he said it would be unsupported. But if those PRs become part of the main code, why should it be unsupported? giving an option to users to disable something that currently is not even documented, and they decide is not needed, how can it be considered unsupported?

Frenck, why are you marking all my comments as off-topic? I replied to your comments, we're discussing, if my reply is off-topic it means your argument is off-topic. Doesn't make sense. You said you were willing to discuss anything, and we're doing it respectfully.

I'm not attacking anybody here, we're discussing technical things about DNS in the CoreDNS plugin github space, how can it be off-topic? I really can't understand why you're so sensitive on this plugin...it's not a war, we're trying to help improve HA, because clearly there are issues on DNS. Denying the issues won't solve them.

frenck commented 2 years ago

You said you were willing to discuss anything

This is not a generic discussion but a PR review. Your comments are general architectural comments and opinions on decisions made in the past for good reasons and are not reflecting the actual code being reviewed here. Please use or community forums or Discord chat if you'd like to discuss these types of things with others.

Thanks.

fenichelar commented 2 years ago

@frenck do you want to discuss via email or something? I'm looking for directions on how to move forward

frenck commented 2 years ago

@frenck do you want to discuss via email or something? I'm looking for directions on how to move forward

No? The direction to move forward has been pointed out already. I'm not sure what else you want from me at this moment. The project is willing to accept the capability of disabling fallback methods within the parameters pointed out already.

fenichelar commented 2 years ago

I'm not sure what else you want from me at this moment.

I asked a few follow up question and never got a response...

frenck commented 2 years ago

I asked a few follow up question and never got a response...

Sorry, maybe I missed something because of all the side spamming in this review. Could you point me out the question?

alexdelprete commented 2 years ago

This is not a generic discussion but a PR review.

I replied to your comments in the PR review, and I provided counter-arguments to what you said. You are clearly not willing to openly discuss, marking everything as off-topic, when my replies are to YOUR in-topic comments regarding the PR. So if I contradict you on something you wrote about the PR, it's off-topic, when it's simply a counter-argument to your argument.

This is not the proper way to discuss a change request, you are clearly in a very defensive position, without explaining transparently and clearly what are the technical reasons behind some implementation decisions.

I don't want to add anything more, because unfortunately there's clearly an asymmetry here: you can close the PR, you can silence anybody, you can decide what pertains or not to the discussion, and we're just passive users that are trying to contribute to an open-source project. This behaviour is not appropriate for an open-source community project.

I love HA, too bad seeing all this unwillingness to collaborate and discuss...it's really sad...

I'm out: it's clear they will never change anything about the dns implementation. we'll have to maintain it manually...

Thanks Frenck.

fenichelar commented 2 years ago

That will not solve the main issue which we solved with this. If you want this then you need to do the following:

What issue?

  1. Is the concern that no DNS servers will be configured (either via DHCP or manually)?
  2. Or is the concern that DNS servers will be configured (either via DHCP or manually) but won't respond to DNS queries Maybe due to DNS server failure, incorrect DNS server IPs being configured, firewall rules, etc.
  3. Or is the concern that DNS servers will be configured (either via DHCP or manually) but won't forward queries and will return NXDOMAIN for all external domains?
  4. Something else?

Open a PR to Supervisor and add an option to disable fallback which marks your system as unsupported and extend it to the API + tests

  1. What should be config flag be called?
  2. What should its type be?

Create an PR to the developer documentation for having it there

Okay

Create an PR to CLI repository for using that options

Okay

Create an PR to this repository to get this options in place

How should this work? My goal would be to find a way to make HA use the DNS server provided via DHCP and nothing else.

fenichelar commented 2 years ago

@frenck Would it make more sense to just have a way to disable plugin-dns and have HA core use DHCP provided or manually configured DNS servers directly?

frenck commented 2 years ago

What issue?

Multiple! The OS is aimed at the average Joe running this at home, with limited knowledge. We have had to deal with many many many different issues and network configurations (and misconfiguration) causing an absolute flood of issues. This has slowly (over time) resulted in what you see here.

Most of the issues currently still there, are often the result of customizations or advanced network setup. This is fine, however, that is not the target audience. As a matter of fact, the number of DNS / name resolving issues hasn't been as low as it is right now.

Broken local DNS, wrongly set up split DNS, misconfigured ad blockers, DNS loops, DNS of provides down causing issues, misconfigured Docker setups or virtual machines, wrongly configured firewalls, and many more.

What should be config flag be called? What should its type be?

A feature flag? I would say a boolean? Name... I dunno, something descriptive, however, that is just a name. Make it something descriptive, in the review of that specific PR that can be discussed (and adjustments to a name can be made quickly in general).

How should this work? My goal would be to find a way to make HA use the DNS server provided via DHCP and nothing else.

Add a flag to the supervisor and describe, pass the flag to the configuration of the DNS plugin so it can use it during the configuration generation.

However, that is a bit out of scope for this PR as described I would say.

Would it make more sense to just have a way to disable plugin-dns and have HA core use DHCP provided or manually configured DNS servers directly?

I don't think so, that will cause a different world of issues, as multiple parts of the system will assume it is there.

fenichelar commented 2 years ago

Broken local DNS, wrongly set up split DNS, misconfigured ad blockers, DNS loops, DNS of provides down causing issues, misconfigured Docker setups or virtual machines, wrongly configured firewalls, and many more.

Cloudflare DNS is currently setup as both a forwarding DNS server and a failover DNS server. I don't understand why this was done. It is hard to discuss a PR allowing this behavior to be changed when the reason for the behavior is unclear. Do you know why Cloudflare DNS was setup as both a forwarding DNS server and a failover DNS server?

Add a flag to the supervisor and describe, pass the flag to the configuration of the DNS plugin so it can use it during the configuration generation.

But what configuration would this result in?

  1. Would it forward only to manually configured DNS servers (if configured) and DHCP provided DNS servers otherwise?
  2. Would it forward to both manually configured and DHCP provided DNS servers?
  3. What about the Docker DNS server?
  4. Something else?

However, that is a bit out of scope for this PR as described I would say.

How do I discuss a PR I am working on then?

frenck commented 2 years ago

Cloudflare DNS is currently setup as both a forwarding DNS server and a failover DNS server. I don't understand why this was done. It is hard to discuss a PR allowing this behavior to be changed when the reason for the behavior is unclear

Resolving anything is more important than solving nothing. The reason I gave in my last answer.

But what configuration would this result in?

Disabling any fallback I would say? That is the goal you are trying to achieve, right?

How do I discuss a PR I am working on then?

This PR is titled: "Fix local DNS resolution issue", you are now discussing adding a supervisor feature flags for disabling fallback. I'm sorry, but that feels offtopic and not related to the content written in this PR (nor can this PR be modified to achieve that even remotely).

fenichelar commented 2 years ago

(nor can this PR be modified to achieve that even remotely).

The first PR you closed could have... but it has been locked so I can't ask for feedback now. Read the discussion in the PR, I actually asked about a feature flag before you even reviewed it. There was a lot of off topic discussion, but not from me. I have just been trying to get feedback on my PR.

I give up :( nevermind

alexdelprete commented 2 years ago

I give up

That's the ultimate goal of these discussions, a nice playbook. ;)

frenck commented 2 years ago

The first PR you closed could have

Sorry it did not, the feature flag needs to be added to the supervisor as described before. None of the PRs I closed added a flag to the supervisor.

There was a lot of off topic discussion, but not from me.

Yeah I know, sorry about that, its really distracting :(

I give up :( nevermind

:( If you change your mind and are up for a shot, let me know

frenck commented 2 years ago

@alexdelprete This formal and final warning. That the last comment is unneeded and condescending (and not the first) another one will result in a ban.

fenichelar commented 2 years ago

Sorry it did not, the feature flag needs to be added to the supervisor as described before. None of the PRs I closed added a flag to the supervisor.

Supervisor just contains the flag, not the logic for what the flag does. I would assume it makes more sense to discuss what the flag does in the plugin-dns repo than the supervisor repo. We are essentially talking about wrapping my existing PR behind a flag, all of the code is still applicable, it would just be wrapped in an if statement.

:( If you change your mind and are up for a shot, let me know

If you are willing to work with me, I would like to tackle this. But I am not going to start working on the PR for supervisor to add a flag until after we figure out what the flag actually does. I don't know about you, but I write code THEN documentation, not the other way around.

I will submit a new PR to plugin-dns for us to discuss shortly. The change will be behind a feature flag that doesn't yet exist. I will submit PRs to add the feature flag after.

frenck commented 2 years ago

I don't know about you, but I write code THEN documentation, not the other way around.

This is not about adding documentation, this is about adding the controls. If there aren't controls and flags created in the supervisor, there will be nothing to test or control to implement.

As for documentation that will be required before merging the final PRs 😉

I will submit a new PR to plugin-dns for us to discuss shortly.

If you do, please do that in conjunction with the matching supervisor and CLI PRs that go with it.

fenichelar commented 2 years ago

@frenck

If you do, please do that in conjunction with the matching supervisor and CLI PRs that go with it.

So you are saying you won't discuss the change here until after I write the PR for supervisor and CLI? So if you provide feedback on this PR that requires a change then I will have to rewrite all 3 PRs... doesn't it make more sense to review the DNS change and confirm it is good before implementing the flag?

frenck commented 2 years ago

The dns part is the smallest part of the implementation and a fairly simple of statement handling the state of the flag (thus barely interesting or discussable imho) The other parts require way more implementation, code and tests.

fenichelar commented 2 years ago

The dns part is the smallest part of the implementation and a fairly simple of statement handling the state of the flag (thus barely interesting or discuss Anne imho) The other parts require way more implementation, code and tests.

But I don't know what change to make in supervisor until we discuss this. There is a high probability that using more than one feature flag is the best way forward. Or maybe you don't want to use multiple flags but instead use a string config option. I think you will understand once I submit a PR.

frenck commented 2 years ago

I don't think it's that difficult: use dns fallback or not. If the fallback is disabled only use the configured DNS servers... 🤷‍♂️ we don't need multiple flag for that.

fenichelar commented 2 years ago

I don't think it's that difficult: use dns fallback or not. If the fallback is disabled only use the configured DNS servers... 🤷‍♂️ we don't need multiple flash for that.

How should the following scenarios be handled?

DNS servers have been provided both manually and via DHCP.

  1. Use manually provided DNS servers only
  2. Use DHCP provided DNS servers only
  3. Use both manually provided DNS servers and DHCP provided DNS servers in the forward line (what order?)
  4. Use manually provided DNS servers in the forward line and DHCP in the fallback line
  5. Use DHCP provided DNS servers in the forward line and manually provided in the fallback line

And this is assuming we aren't including the Docker DNS server anywhere. Should the Docker DNS server be used?

fenichelar commented 2 years ago

Maybe instead of a single failover feature flag, having a dns_mode flag makes more sense. With the following options:

automatic - current behavior DHCP - use DHCP provided DNS servers only manual - use manually provided DNS servers only docker - use Docker DNS server only

frenck commented 2 years ago

I really don't think we should handle those cases at all from a flags perspective.

If the single flag is turning off fallback: In case manual server are provided use those, else use DHCP provided one.

fenichelar commented 2 years ago

If the single flag is turning off fallback: In case manual server are provided use those, else use DHCP provided one.

Ok. What about Docker DNS server? Only use Docker if no servers are provided manually or via DHCP?

frenck commented 2 years ago

Docker dns has no part in this, nor does it need to be. It will match the already available info, as the supervisor manages the network settings.

fenichelar commented 2 years ago

@frenck The case I am trying to handle is if no servers are provided manually and no servers are provided via DHCP? Should the forward line just contain dns://127.0.0.11?

itsdeltalol commented 2 years ago

What issue?

Multiple! The OS is aimed at the average Joe running this at home, with limited knowledge. We have had to deal with many many many different issues and network configurations (and misconfiguration) causing an absolute flood of issues. This has slowly (over time) resulted in what you see here.

Hey, this describes me pretty well, I have some really limited knowledge, probably don't even belong in this discussion, but I'm having trouble getting Home Assistant to recognize the .local hosts that raspberry pi custom builds tend to set up. I have like three of them and I'm really going to hate to have to re-do each card when that device powers up and gets a new IP. It'd be really good if there was a thing to handle that.

Zixim commented 2 years ago

"a thing to handle that".. well that would be DNS...except HAOS devs have decided to break it, for "reasons".