opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

DPinger fails to start on v6 when PPPoE connection resets #3693

Closed marjohn56 closed 4 years ago

marjohn56 commented 5 years ago

I've had this problem since way back when. It's not really a massive issue, just an annoyance, even more so when I go on holiday and my email is then full of Monit alerts!

I've now created a little workaround for this issue using Monit to restart the offending dpinger instance. In order to do this I needed to 'echo' the dpinger commands to the /tmp folder, thus it was then easy to make Monit call that command and away it goes.

In dpinger.inc

    mwexec("/usr/local/bin/dpinger {$params} > /dev/null");

    /* Put the commands into /tmp for monit to use */
    file_put_contents("/tmp/dpinger_{$name}_cmd", "/usr/local/bin/dpinger {$params} > /dev/null");
    @chmod("/tmp/dpinger_{$name}_cmd", 0755);

I don't know whether you want to use this way of fixing the issue or look for a deeper fix.
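For illustration, here is a minimal sketch of the Monit side of this workaround: a small wrapper that relaunches one instance from its saved command file. The function names and the `DPINGER_CMD_DIR` override are hypothetical and exist only to make the sketch testable; the `/tmp/dpinger_<name>_cmd` path comes from the snippet above.

```shell
#!/bin/sh
# Sketch only: relaunch one dpinger instance from the command file saved in /tmp.
# DPINGER_CMD_DIR is a hypothetical override; the default matches the path above.

# Print the path of the saved command file for a gateway name.
dpinger_cmd_file() {
    printf '%s/dpinger_%s_cmd\n' "${DPINGER_CMD_DIR:-/tmp}" "$1"
}

# Run the saved command if the file exists; return 1 otherwise.
dpinger_relaunch() {
    cmd_file=$(dpinger_cmd_file "$1")
    if [ ! -r "$cmd_file" ]; then
        echo "no saved command for $1" >&2
        return 1
    fi
    # The file holds a single shell command line, so hand it to sh.
    /bin/sh "$cmd_file"
}

# Example (gateway name is a placeholder):
#   dpinger_relaunch WAN_DHCP6
```

Monit would then only need to call a script containing `dpinger_relaunch` with the gateway name, rather than knowing the full dpinger parameter list.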

nivek1612 commented 4 years ago

A really useful little bit of code, this. Due to DSL resyncs every so often, I'd find dpinger was down on the IPv6 gateway. Since adding this, the problem is no more. Nice work @marjohn56

fichtner commented 4 years ago

We could go back to keeping it in foreground mode but actually backgrounding it: a9e05d5722d

Something weird is going on inside dpinger for sure, but I haven't got around to actually looking at why it stalls.

marjohn56 commented 4 years ago

It won't start as there is no v6 there until the PPPoE has come up. I suspect this is an edge case that we are seeing and it probably only affects users who are on PPPoE AND doing v6 monitoring.

However having these temp strings does open up opportunities for use with monit to fire up this and other processes that might be useful - maybe a sub folder in /tmp that contains the commands... just thinking out loud.

fichtner commented 4 years ago

Well, if we background it it can start whenever it's ready, doesn't it?

Which reminds me... did you see we killed "Directly send SOLICIT" option on master after a debugging session with a stale ISP connection? It should support both scenarios now likewise.

fichtner commented 4 years ago

PS: restarting should be trivial nowadays:

# pluginctl -s dpinger restart

marjohn56 commented 4 years ago

> Well, if we background it it can start whenever it's ready, doesn't it?
>
> Which reminds me... did you see we killed "Directly send SOLICIT" option on master after a debugging session with a state ISP connection? It should support both scenarios now likewise.

No I didn't. Beware of this as Sky UK specifically require that a solicit is sent before a dhcp is sent, it's a sneaky way of trying to prevent third party routers from being used.

Pray tell, what's a 'STATE' ISP connection?

marjohn56 commented 4 years ago

> PS: restarting should be trivial nowadays:
>
> # pluginctl -s dpinger restart

Which dpinger instance, or is that all of them?

Yup, that works, I'll change my monit command. :)

fichtner commented 4 years ago

That would be all of them at the moment. But it can be improved.

state = stale, sorry.. ISP was not sending router advertisements after initial client connect so the reconnect never executed.

SOLICIT is always sent unconditionally and router advertisements are served on the side. Best of both worlds.

marjohn56 commented 4 years ago

Cool.. OK sounds good. Question, the restart works, but from monit I still need to call a script with the pluginctl command or is there a simpler way?

At present I have this where the script itself has the command, could I run the command directly?

image

Greelan commented 3 years ago

@marjohn56 sorry to necro this issue but I was wondering if you figured out the answer to your query above about directly calling the pluginctl restart command in monit?

I too am finding dpinger failing to start on IPv6 whenever the WAN interface goes down/up - whether reboot, reload or ISP downtime. I am not on PPPoE though, so the issue is broader.

I'd like to set up an automatic restart in the simplest way possible.

Greelan commented 3 years ago

Just saw the related forum thread on this (https://forum.opnsense.org/index.php?topic=18745.0). I will give it a shot with the pluginctl command and report back. 😀

marjohn56 commented 3 years ago

It would be pluginctl -s dpinger restart or pluginctl -s dpinger start. Tried it and unless I'm doing something wrong it doesn't restart/start the v6 dpinger, it does start the v4 one though. Might be a clue as to why it doesn't pick up automatically to begin with.

Greelan commented 3 years ago

Yeah, thanks. I've just tried that (or more specifically /usr/local/sbin/pluginctl -s dpinger restart) as the Execute command, but monit seems to be throwing syntax errors. Hmmm

marjohn56 commented 3 years ago

Just call my script for now. I'll take a look at this next week if I have some time, or maybe @fichtner will read this and take a look. As I said, it restarts the v4 instance, but for a reason I cannot fathom at short notice it's not kicking the v6 instance. Odd, because my script calls more or less the same function. The script is called dpinger_starter and lives in rc.d. I cannot paste it, but here's an image:

image

Greelan commented 3 years ago

Doh, figured out why the syntax errors, rookie error - I was trying to run pluginctl directly rather than passing it to sh.

fichtner commented 3 years ago

Service restart framework looks for the first dpinger if the ID is omitted...

# pluginctl -s dpinger restart MY_GATEWAY

... restarts the correct one I guess ;)

Greelan commented 3 years ago

I see. So I have changed my execute command to /bin/sh -c '/usr/local/sbin/pluginctl -s dpinger restart WAN_DHCP6'. That look right?

Is "restart" the right command if dpinger is not actually running on IPv6, rather than just "start"? Or doesn't it really matter?

fichtner commented 3 years ago

Looks good. start or restart doesn't really matter in this context. I would expect restart to be a little more resilient as start normally will not reconfigure (it's running but broken, start won't touch it).

You can test via killall dpinger and see if only your IPv6 one restarts.
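As a sketch of what a Monit-callable wrapper for this could look like, the function below restarts the monitor of one named gateway. The function name and the `PLUGINCTL` override are hypothetical; the `pluginctl -s dpinger restart <gateway>` form and the `/usr/local/sbin/pluginctl` path are taken from the comments above.

```shell
#!/bin/sh
# Sketch only: restart the dpinger instance of one named gateway.
# PLUGINCTL is a hypothetical override so the function can be dry-run.

restart_gateway_monitor() {
    gateway=$1
    if [ -z "$gateway" ]; then
        echo "usage: restart_gateway_monitor GATEWAY_NAME" >&2
        return 2
    fi
    # Passing the gateway name targets that instance; without it the
    # service framework picks the first dpinger it finds.
    "${PLUGINCTL:-/usr/local/sbin/pluginctl}" -s dpinger restart "$gateway"
}

# Example (gateway name is a placeholder):
#   restart_gateway_monitor WAN_DHCP6
```

Keeping the gateway name as a required argument guards against the "first dpinger only" behaviour described earlier in the thread.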

Greelan commented 3 years ago

Brilliant - it worked. :) Thanks for the input, @fichtner and @marjohn56.

Excuse my noob questions, but can I ask:

Thanks again

fichtner commented 3 years ago

I think internally a shell is opened to run the command, so you can omit an explicit shell invocation. Not sure I understand the second question. For PHP services not using rc/rc.conf, start and restart are the same and are only provided for symmetry with the services that use the rc framework exclusively to avoid custom PHP scripting (web proxy and intrusion detection do this, for example).

Greelan commented 3 years ago

Thanks. monit won't let me apply the service test without /bin/sh -c. My question was more whether /usr/local/bin/php -c was more appropriate

marjohn56 commented 3 years ago

> Service restart framework looks for the first dpinger if the ID is omitted...
>
> # pluginctl -s dpinger restart MY_GATEWAY
>
> ... restarts the correct one I guess ;)

Can we modify it so it skips through and looks at all the dpinger instances, restarting any stopped instances automatically?

fichtner commented 3 years ago

Not from the shell. It requires PHP coding.

marjohn56 commented 3 years ago

Yes, that's what I mean: mod the services so it checks all the dpinger instances that should be running.
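Outside the PHP service framework, a rough shell approximation of "check all instances that should be running" might look like the sketch below. It assumes one pid file per gateway named `dpinger_<GATEWAY>.pid`; that naming, the directory, and the function name are assumptions to verify on your own system, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: list gateways whose dpinger instance is not running.
# Assumes pid files named dpinger_<GATEWAY>.pid in the given directory.

stopped_monitors() {
    pid_dir=$1
    shift
    for gateway in "$@"; do
        pid_file="$pid_dir/dpinger_${gateway}.pid"
        # A missing pid file, or a pid that no longer exists, counts as stopped.
        if [ -r "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
            continue
        fi
        printf '%s\n' "$gateway"
    done
}

# Example (directory and gateway names are placeholders):
#   stopped_monitors /var/run WAN_GW WAN_DHCP6
```

Each name it prints could then be passed to `pluginctl -s dpinger restart <name>`, so only the stopped instances get touched.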

marjohn56 commented 3 years ago

All I call is dpinger_configure_do(), which starts/restarts all instances.

fichtner commented 3 years ago

But the service controls on the GUI won't, and pluginctl is using that same abstraction.

marjohn56 commented 3 years ago

fairy nuff.. still begs the question why it doesn't auto recover though. :)

fichtner commented 3 years ago

It does depend on what IP it tries to pick from the interface. We have the funny situation that autoconf is used on the WAN even when DHCP6 is used, so you get a prefix from DHCP6 but not an address, and dpinger won't be able to ping a global address from WAN until it gets an autoconf IPv6 from the router...

fichtner commented 3 years ago

(ideally with DHCPv6 prefix only we need to grab the IPv6 from tracking interface to make it more reliable)
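To make the condition above concrete, the sketch below checks whether an interface already has an IPv6 address that is neither link-local nor still tentative, which is roughly the point at which a monitor could start pinging. It is not how OPNsense decides this; the function name is hypothetical and the parsing assumes FreeBSD-style `ifconfig` output.

```shell
#!/bin/sh
# Sketch only: succeed if ifconfig-style input on stdin lists an IPv6 address
# that is not link-local (fe80::/10) and not still tentative.

has_usable_inet6() {
    awk '
        $1 == "inet6" && tolower($2) !~ /^fe[89ab]/ && $0 !~ /tentative/ { found = 1 }
        END { exit !found }
    '
}

# Example (interface name is a placeholder):
#   ifconfig pppoe0 | has_usable_inet6 && echo "address present"
```

With a prefix-only DHCPv6 setup, this check stays false on the WAN until the router advertisement delivers an autoconf address, which matches the stall described above.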

marjohn56 commented 3 years ago

On IRC now and pinging you