openwrt / packages

Community maintained packages for OpenWrt. Documentation for submitting pull requests is in CONTRIBUTING.md
GNU General Public License v2.0

isc-dhcp: dhcrelay4 needs restarting after boot #22708

Open brianjmurrell opened 10 months ago

brianjmurrell commented 10 months ago

Maintainer: @pprindeville
Environment: x86_64 OpenWrt 23.05.0 r23497-6637af95aa

Description: I seem to have to restart the dhcrelay4 service after the router has booted in order for it to function. Before the restart, /usr/sbin/dhcrelay is running but does no relaying at all until I run /etc/init.d/dhcrelay4 restart.

I suspect it is being started too early in the boot sequence, before something it needs is ready. Its placement at S91 would make that a bit surprising, though.

brada4 commented 10 months ago

Migrate to kea. dhcp relay is discontinued upstream and will be removed soon. You can reload network services via hotplug scripts if they are slow to accommodate IP changes.

brianjmurrell commented 10 months ago

dhcp relay is discontinued upstream and will be removed soon.

Will it? @pprindeville Is that your plan?

@brada4 Thanks for your input, but I'm perfectly aware of the status of the ISC DHCP server project, having been a previous contributor to it.

However isc-dhcp-relay-ipv4 = 914kB and kea-dhcp4 = 281kB + kea-libs = 2364kB + libopenssl1.1 + log4cplus + boost + boost-system + etc. + etc.

Kea is not so friendly to embedded systems.
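To put the figures above side by side, here is a minimal sketch that sums the two Kea sizes quoted in this comment and compares the total with the relay package. Only the three figures quoted above are used; the sizes of libopenssl1.1, log4cplus and boost are not given in this thread and are left out, so the real Kea footprint would be larger. The helper name sum_kb is made up for illustration.

```shell
#!/bin/sh
# Sizes in kB, copied from the comment above.
RELAY_KB=914       # isc-dhcp-relay-ipv4
KEA_DHCP4_KB=281   # kea-dhcp4
KEA_LIBS_KB=2364   # kea-libs

# Sum any number of integer kB values (hypothetical helper).
sum_kb() {
    total=0
    for kb in "$@"; do
        total=$((total + kb))
    done
    echo "$total"
}

KEA_KB=$(sum_kb "$KEA_DHCP4_KB" "$KEA_LIBS_KB")
echo "isc-dhcp-relay-ipv4:  ${RELAY_KB} kB"
echo "kea-dhcp4 + kea-libs: ${KEA_KB} kB (before other dependencies)"
# Kea subtotal as an integer percentage of the relay package size.
echo "ratio: $((KEA_KB * 100 / RELAY_KB))%"
```

Even without its remaining dependencies, the Kea subtotal comes to nearly three times the size of the relay package.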

As long as @pprindeville is willing to provide isc-dhcp packages, I will continue to use them.

You can reload network services via hotplug scripts if they are slow to accommodate IP changes.

But the point is that I shouldn't have to. The package as supplied by OpenWRT should not need to have every user hack their installation due to this bug. This issue is to document the bug and hopefully bring the maintainer's attention to it in hope that he sees a quick and easy fix so that everyone using it doesn't need to apply their own manual hacks.

brada4 commented 10 months ago

EOL tracker: https://github.com/openwrt/packages/issues/20793
Relay can be done with dnsmasq, which is included by default.

brada4 commented 10 months ago

Maybe a later start can help, maybe not.

pprindeville commented 10 months ago

dhcp relay is discontinued upstream and will be removed soon.

Will it? @pprindeville Is that your plan?

I have a rewrite of the current dhcpd.init that emits a Kea config file instead. It's blocked on a few dependency PRs like https://github.com/openwrt/openwrt/pull/13765 and https://github.com/openwrt/libubox/pull/6. Some of these PRs are more than 2 months old.

@brada4 Thanks for your input, but I'm perfectly aware of the status of the ISC DHCP server project, having been a previous contributor to it.

However isc-dhcp-relay-ipv4 = 914kB and kea-dhcp4 = 281kB + kea-libs = 2364kB + libopenssl1.1 + log4cplus + boost + boost-system + etc. + etc.

Kea is not so friendly to embedded systems.

But it does drop in nicely to SMB environments. It's more enterprise friendly.

As long as @pprindeville is willing to provide isc-dhcp packages, I will continue to use them.

I was hoping to have retired it by now, honestly, but getting the maintainers of the dependent PRs to merge (or even comment on the PRs) is not an expedient process. Sometimes it feels like we're drifting back into the dynamic that caused the LEDE schism...

You can reload network services via hotplug scripts if they are slow to accommodate IP changes.

But the point is that I shouldn't have to. The package as supplied by OpenWRT should not need to have every user hack their installation due to this bug. This issue is to document the bug and hopefully bring the maintainer's attention to it in hope that he sees a quick and easy fix so that everyone using it doesn't need to apply their own manual hacks.

Part of the issue is also that the maintainers can't always easily reproduce every use case in a viable environment that's adequate for testing.

brianjmurrell commented 10 months ago

Part of the issue is also that the maintainers can't always easily reproduce every use case in a viable environment that's adequate for testing.

That's completely fair. I'm happy to instrument and test in any way I can if you have any theories.

brada4 commented 10 months ago

Check the start order in luci/system/startup; you might want to start it later, once the interface IPs are stably configured.

brianjmurrell commented 10 months ago

As I stated in the original description, it already starts pretty late at S91. The only startup scripts after it are:

lrwxrwxrwx    1 root     root            21 Oct  9 17:45 S94gpio_switch -> ../init.d/gpio_switch
lrwxrwxrwx    1 root     root            14 Oct  9 17:45 S95ddns -> ../init.d/ddns
lrwxrwxrwx    1 root     root            14 Oct  9 17:45 S95done -> ../init.d/done
lrwxrwxrwx    1 root     root            13 Oct  9 17:45 S96led -> ../init.d/led
lrwxrwxrwx    1 root     root            17 Oct  9 17:45 S98sysntpd -> ../init.d/sysntpd
lrwxrwxrwx    1 root     root            15 Oct  9 17:45 S99snmpd -> ../init.d/snmpd
lrwxrwxrwx    1 root     root            22 Oct  9 17:45 S99urandom_seed -> ../init.d/urandom_seed
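
A listing like the one above can be produced by filtering /etc/rc.d on the numeric START prefix of each link. This is a minimal sketch; the function name scripts_after is made up for illustration, and 91 is the S91 position mentioned above.

```shell
#!/bin/sh
# Print the start links in an rc.d directory whose START number is
# greater than the given threshold. Link names look like S91dhcrelay4.
scripts_after() {
    threshold=$1
    dir=${2:-/etc/rc.d}
    for link in "$dir"/S[0-9][0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}
        num=${name#S}
        num=${num%%[!0-9]*}   # keep only the leading digits
        num=${num#0}          # drop one leading zero, e.g. 09 -> 9
        [ "${num:-0}" -gt "$threshold" ] && echo "$name"
    done
    return 0
}

# Only query the real directory when it exists (i.e. on the router).
if [ -d /etc/rc.d ]; then
    scripts_after 91 /etc/rc.d
fi
```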

The problem is likely that the interfaces are not brought up synchronously with the startup scripts, and so can come up at any point relative to the startup script ordering.

That said, the two interfaces that dhcrelay is working on are both local network interfaces, so there shouldn't be much delay in their coming up.
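
One way to test this theory is to log every interface event and compare the timestamps with the time dhcrelay was started. The following is a minimal sketch, assuming it is saved under /etc/hotplug.d/iface/ (the file name 00-trace and the log tag iface-trace are arbitrary choices made here). It relies on the ACTION, INTERFACE and DEVICE variables that are set for scripts in that directory.

```shell
#!/bin/sh
# Hypothetical /etc/hotplug.d/iface/00-trace

# Build the line to log for one interface event.
trace_line() {
    echo "action=${1:-?} interface=${2:-?} device=${3:-?}"
}

# ACTION, INTERFACE and DEVICE come from the hotplug call. Outside of
# a hotplug call ACTION is empty and nothing is logged.
# No 'exit' is used, since hotplug scripts are sourced by the caller.
if [ -n "$ACTION" ]; then
    logger -t iface-trace "$(trace_line "$ACTION" "$INTERFACE" "$DEVICE")"
fi
```

After a reboot, logread -e iface-trace should show when each interface came up, which can then be compared with the point at which the S91 script ran.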

But indeed, perhaps starting dhcrelay needs to be a hotplug event, not a startup script. Does that seem plausible/reasonable, @pprindeville? The fact that it needs to be brought up after two (or more!) interfaces are up makes it a bit more complicated. I'm not sure what facility is available in a hotplug script to query whether the other interfaces dhcrelay operates on (other than the one for which the hotplug script is running) are up yet. Obviously such a hotplug script would only start dhcrelay on the last interface-up event among all of the involved interfaces.
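
The hotplug idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual behaviour: the file name 95-dhcrelay4 and the logical interface names lan and guest are hypothetical, and the state of the other interfaces is queried through ubus and jsonfilter, which are normally present on OpenWrt.

```shell
#!/bin/sh
# Hypothetical /etc/hotplug.d/iface/95-dhcrelay4
# The logical interface names are an assumption; match them to the relay config.
RELAY_IFACES="lan guest"

# True if netifd reports the logical interface as up.
iface_is_up() {
    [ "$(ubus call "network.interface.$1" status 2>/dev/null \
        | jsonfilter -e '@.up' 2>/dev/null)" = "true" ]
}

# True only if every listed interface is up.
all_ifaces_up() {
    for iface in "$@"; do
        iface_is_up "$iface" || return 1
    done
    return 0
}

# True if the first argument appears among the remaining ones.
is_member() {
    needle=$1
    shift
    for item in "$@"; do
        [ "$item" = "$needle" ] && return 0
    done
    return 1
}

# Act only on ifup of an interface the relay uses, and only once all of
# them are up. No 'exit' is used, since hotplug scripts are sourced.
if [ "$ACTION" = "ifup" ] && is_member "$INTERFACE" $RELAY_IFACES \
        && all_ifaces_up $RELAY_IFACES; then
    /etc/init.d/dhcrelay4 restart
fi
```

With this approach the restart fires on whichever involved interface comes up last, and again whenever one of them later goes down and comes back.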

brada4 commented 10 months ago

Kea does not include a relay itself; you need to adapt dnsmasq as a replacement. If interfaces are very slow to come up, you can put something like this in rc.local: (sleep 42 ; service dhcrelay4 restart) &

brianjmurrell commented 10 months ago

Sleeps are hacky, racy and error-prone. Things should be event driven, not waiting and hoping for the best.

brada4 commented 10 months ago

Migrate to dnsmasq; it is not a year into EOL. dhcp-relay=upstream_ip is the needed option.

brada4 commented 10 months ago

Actually, the relay should have been dismissed long ago: https://lists.isc.org/pipermail/dhcp-users/2021-June/022495.html

pprindeville commented 10 months ago

Migrate to dnsmasq; it is not a year into EOL. dhcp-relay=upstream_ip is the needed option.

The migration to Kea would be done by now if I could get approval for my changes here.

brianjmurrell commented 10 months ago

The migration to Kea would be done by now if I could get approval for my changes here.

But as mentioned before, Kea is far, far too big[1] for an embedded platform like most consumer-grade wireless routers (i.e. all of the devices OpenWRT targets and supports). That's a huge amount of baggage just to get DHCP relaying services, which is why I proposed enabling BusyBox's udhcprelay.

[1] kea-dhcp4 = 281kB + kea-libs = 2364kB + libopenssl1.1 + log4cplus + boost + boost-system + etc. + etc.

brada4 commented 10 months ago

Kea is not a replacement for dhcp-relayd. The replacement is dnsmasq, which is in the base system.

brianjmurrell commented 10 months ago

dnsmasq is also very much bloated for just the purposes of DHCP relay, containing a full DHCP server and a DNS server that will be unused.

udhcprelay is the optimal solution IMO.

brada4 commented 10 months ago

dnsmasq is not well instrumented for isolating its individual functions, such as running only the DHCP relay, and even less so for making a trimmed-down relay-only package, but give it a try.

brianjmurrell commented 10 months ago

This is all just mirroring the discussion in my udhcprelay enablement request so probably better to just catch up there and continue the discussion there.

udhcprelay is the optimal solution here, just as udhcpc has long been the primary OpenWRT DHCP client despite the availability of the ISC dhcp-client and, potentially in the future, the Kea DHCP client.

But again, the forum thread above is probably the right place to continue any discussion.

brada4 commented 10 months ago

Yes, present instrumentation is not ideal, though good luck.

brada4 commented 10 months ago

dnsmasq is also very much bloat

The ISC relay is 8x bigger.

brianjmurrell commented 10 months ago

The ISC relay is 8x bigger.

But it is (soon) no longer even a choice, so that's a strawman argument at best, yes? Can we move on?

brada4 commented 10 months ago

The client and relay have been formally EOL for 2.5 years.

brianjmurrell commented 10 months ago

The client and relay have been formally EOL for 2.5 years.

Yes. You have said this already, at least once if not more. Nobody is arguing about it so why do you keep bringing it up? I have asked previously if we can move on. Can we please?

Given that an available option for DHCP Relay is going to be removed, it seems appropriate that another option (other than the bloated-for-the-purpose dnsmasq which needs all kinds of configuration to be stripped back to just being a DHCP relay) ought to be made available and BusyBox's udhcprelay seems like a good, single-purpose option.

All of this is waaay OT for this ticket. Further discussion should be at the link above.

pprindeville commented 10 months ago

The migration to Kea would be done by now if I could get approval for my changes here.

But as mentioned before, Kea is far, far too big[1] for an embedded platform like most consumer-grade wireless routers (i.e. all of the devices OpenWRT targets and supports). That's a huge amount of baggage just to get DHCP relaying services, which is why I proposed enabling BusyBox's udhcprelay.

[1] kea-dhcp4 = 281kB + kea-libs = 2364kB + libopenssl1.1 + log4cplus + boost + boost-system + etc. + etc.

Based on what? I've run OpenWRT on Amazon AWS instances and Supermicro 1U servers with 8 cores and 128GB of DRAM... Repeating a statement doesn't make it any more true.

brianjmurrell commented 10 months ago

Based on what? I've run OpenWRT on Amazon AWS instances and Supermicro 1U servers with 8 cores and 128GB of DRAM...

I assume in the context of this thread you mean you have run the Kea DHCP server on such a system. And with no swap, right? What else did you run on it?

Just because Kea (and perhaps nothing much else) will fit into a 128MB machine with no swap doesn't mean it's not limiting to use so much memory for such a small task as DHCP relay, to the detriment of other things that could be using some of that wasted RAM.

But I think your point is missing the point. RAM is not the (primary) issue here, it's flash size. Adding such and so many big packages to a limited flash image size means having to sacrifice other things.

brada4 commented 10 months ago

dnsmasq does not permit disabling the DNS part or running a sub-instance for DHCP relay via LuCI. But dnsmasq itself takes the features to enable as parameters, defaulting to idling and doing nothing, so you can start that relay alongside the DNS cache, disabling DHCP in LuCI on the involved interfaces.

brianjmurrell commented 10 months ago

All the more reason why https://forum.openwrt.org/t/enable-busybox-dhcp-relay/178500/5 is the better path forward.

brada4 commented 10 months ago

And what is the conclusion from your assumptions there? DHCP relay is already supported; I don't see why one would repeatedly insist on growing busybox for everyone.

pprindeville commented 10 months ago

Based on what? I've run OpenWRT on Amazon AWS instances and Supermicro 1U servers with 8 cores and 128GB of DRAM...

I assume in the context of this thread you mean you have run the Kea DHCP server on such a system. And with no swap, right? What else did you run on it?

No, right now I have ISC-DHCP running on this beast. I also have it running on an APU6 with 4GB and 4 cores. What else is running? lighttpd, collectd, and a few other things...

I'd run Kea but I'm blocked on getting some other patches accepted that I need. See my posting within the last hour.

Just because Kea (and perhaps nothing much else) will fit into a 128MB machine with no swap doesn't mean it's not limiting to use so much memory for such a small task as DHCP relay, to the detriment of other things that could be using some of that wasted RAM.

128GB, not 128MB. And I didn't say I needed all that memory to support DHCP; I don't. I'm saying not everyone runs OpenWRT on resource-limited machines. Most of my DRAM is used by collectd and backend scripts that do real-time data reduction on honeypots to analyze and categorize new threats in near real-time.

But I think your point is missing the point. RAM is not the (primary) issue here, it's flash size. Adding such and so many big packages to a limited flash image size means having to sacrifice other things.

... or you're missing the point that OpenWRT doesn't solely run on constrained systems.

pprindeville commented 10 months ago

And what is the conclusion from your assumptions there? DHCP relay is already supported; I don't see why one would repeatedly insist on growing busybox for everyone.

And the intent at the core of busybox was always to have it be minimalistic.

brianjmurrell commented 10 months ago

128GB, not 128MB.

Oh. I missed that, I guess because I was missing your point entirely. But I think you are missing mine.

And I didn't say I needed all that memory to support DHCP; I don't. I'm saying not everyone runs OpenWRT on resource-limited machines.

Fair enough. But I'd take a very wild guess and say that the majority of people do run OpenWRT on resource-limited, embedded systems.

I don't think I ever said that Kea should not be packaged and available in OpenWRT for those that can and want to use it. My point was just that Kea is not a suitable (or perhaps even viable) substitute for the ISC DHCP Relay, which is on the cusp of being removed from OpenWRT.

... or you're missing the point that OpenWRT doesn't solely run on constrained systems.

But again, in percentage terms, I would guess that most of the time it does, and as such there is a need for another DHCP relay tool.

brianjmurrell commented 10 months ago

And what is the conclusion from your assumptions there? DHCP relay is already supported; I don't see why one would repeatedly insist on growing busybox for everyone.

And the intent at the core of busybox was always to have it be minimalistic.

I'm not a busybox core developer or even contributor so I cannot dispute that. But I might offer an alternative perspective for busybox and that's to provide as many tools in as small a footprint as possible.

There are lots of cases of OpenWRT enabled busybox applets that also have heavier-weight dedicated alternatives such as the several iptools packages as just one example. If a core principle of OpenWRT was to make busybox as small as possible by not enabling applets for which there are alternative packages available, the ip applet would most certainly not be available. Ditto for /bin/ps, and on and on.

brada4 commented 10 months ago

dnsmasq already contains a superset of dhcrelay functionality. You can run dnsmasq --dhcp-relay=192.168.1.1,<relay-server-ip>[#67][,<network name tag if not openwrt.lan#br-lan>] alongside the default dnsmasq, disabling its DHCP service on the particular interface.
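
The option string described above can be assembled with a small helper. This is a minimal sketch: the function name relay_opt is made up for illustration, 192.0.2.10 is a placeholder server address, and the argument order (local address, server address, optional #port, optional interface) follows the form given in the comment above.

```shell
#!/bin/sh
# Build a dnsmasq --dhcp-relay option string (hypothetical helper).
# Usage: relay_opt LOCAL_ADDR SERVER_ADDR [SERVER_PORT] [INTERFACE]
relay_opt() {
    local_addr=$1
    server_addr=$2
    port=$3
    iface=$4
    opt="--dhcp-relay=${local_addr},${server_addr}"
    # The server port is appended after '#', the interface after ','.
    [ -n "$port" ] && opt="${opt}#${port}"
    [ -n "$iface" ] && opt="${opt},${iface}"
    echo "$opt"
}

# Placeholder addresses; substitute the router's local address and the
# real upstream DHCP server.
relay_opt 192.168.1.1 192.0.2.10 "" br-lan
```

The resulting string would be passed to a second dnsmasq instance, with the DHCP service of the default instance disabled on that interface as the comment describes.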