dougbw / coredns_omada

CoreDNS plugin for TP-Link Omada SDN
Apache License 2.0

Clients not found by Omada #47

Open stordoff opened 5 days ago

stordoff commented 5 days ago

This seems to be an Omada bug, but I'm wondering whether there's a fix or workaround on the coredns_omada side. Fairly frequently, wired hosts disappear from the Omada client list (seemingly those with little traffic). When this happens, coredns_omada can't resolve the client:

Expected:

>nslookup alpine-docker.int.stordoff.com 10.0.0.5
Server:  alpine-docker.int.stordoff.com
Address:  10.0.0.5

Name:    alpine-docker.int.stordoff.com
Address:  10.0.0.5

Actual:

>nslookup alpine-docker.int.stordoff.com 10.0.0.5
Server:  UnKnown
Address:  10.0.0.5

*** UnKnown can't find alpine-docker.int.stordoff.com: Server failed

alpine-docker is the host where the coredns_omada Docker image is running, and it has IP address 10.0.0.5 (static DHCP lease). Once the client reappears in the client list, it can be resolved again.

dougbw commented 5 days ago

Hey, this had crossed my mind. Currently, every time the refresh runs it creates a new set of records from scratch, so records can be removed quite aggressively (I also have a particular host which seems to flap in and out of the Omada controller).

It would be a straightforward change to start the refresh from the existing set of records, but I would need to think about how to then clean up stale records after a period of time, as keeping the records indefinitely could cause some inconsistencies.

stordoff commented 5 days ago

Hi,

Thanks for replying. I don't know if it's possible or would cause other issues, but just keeping the DHCP reservations available indefinitely would resolve most of the issue for me. Any other hosts disappearing/reappearing are not a major concern (I assume I could add these reservations manually to coredns, but that defeats the purpose of keeping management within Omada).

dougbw commented 5 days ago

I could put the option to keep stale records behind an optional config property which would be quite a simple change. I will take a look the next time I am working on this.

dougbw commented 1 day ago

I have had a look at the code and am planning to make changes to support a couple of features.

First of all I will add resolution for DHCP reservations as managed by the Omada controller in the services page.

Secondly, rather than starting from a fresh list of clients on every refresh, it will retain state across refresh intervals, whilst purging records once they exceed a configurable age (e.g. 1 hour). This will make the refresh slightly less aggressive, as I have had minor issues with records disappearing briefly if a client crashes/reboots/etc.

dougbw commented 23 hours ago

I have just published a beta release with support for resolving DHCP reservations and persisting stale client/device records across refreshes. By default stale records are purged after 5 minutes (although this can be extended via config).

I am going to do some more testing before promoting this release, but initially it looks to be working fine.

stordoff commented 22 hours ago

Thanks for that. I've pulled the beta Docker image (trying stale_record_duration 60m as a few of my clients disappear for extended periods) and it seems to be working fine for now. I'll let you know if I run into any issues.