Closed tbandixen closed 5 years ago
Somehow there are missing entries?
Feb 8 11:18:06 apu dhclient: Creating resolv.conf
Feb 8 11:18:06 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'
Feb 8 11:18:06 apu opnsense: /usr/local/etc/rc.newwanip: On (IP address: X.X.X.X) (interface: WAN[wan]) (real interface: re1_vlan10).
Feb 8 11:18:06 apu kernel: stf0: changing name to 're1_vlan10_stf'
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to Y.Y.Y.Y
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway 'Y.Y.Y.Y'
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to Z.Z.Z.Z
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/re1_vlan10_stf_defaultgwv6
Feb 8 11:18:07 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/re1_vlan10_stf_defaultgwv6 using 'Z.Z.Z.Z'
Feb 8 11:18:13 apu opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 8 11:18:13 apu kernel: ovpns2: link state changed to DOWN
Feb 8 11:18:20 apu kernel: ovpns2: link state changed to UP
Feb 8 11:18:21 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns2'
Feb 8 11:18:21 apu opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 8 11:18:28 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: updating cache file /var/cache/dyndns.org_0.cache: X.X.X.X
Feb 8 11:18:28 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address
I have a similar behaviour. After upgrade to 19.1 (using 19.1.1 now) I get this:
gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
gif0: link state changed to DOWN
gif0: link state changed to UP
ovpns1: link state changed to DOWN
ovpns1: link state changed to UP
My SSL VPN connection breaks like every 5 minutes, even though I have regen set to 7200 seconds.
And my SSL VPN doesn't go over the GIF, it goes over WAN, so I think it's odd.
This is from my general log:
Feb 8 17:00:32 | opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Error) Authentication failed
-- | --
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.filter_configure: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.filter_configure: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 8 17:00:29 | opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns1'
Feb 8 17:00:29 | kernel: ovpns1: link state changed to UP
Feb 8 17:00:28 | opnsense: /usr/local/etc/rc.filter_configure: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:28 | opnsense: /usr/local/etc/rc.filter_configure: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:27 | kernel: ovpns1: link state changed to DOWN
Feb 8 17:00:27 | opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Cannot switch while 0 inet6 gateways are up
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Cannot switch while 0 inet gateways are up
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Adding static route for monitor 9.9.9.9 via 90.232.8.233
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Removing static route for monitor 9.9.9.9 via 90.232.8.233
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv6 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway '213.67.96.1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to 213.67.96.1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: Clearing states for stale opt7 route on gif0
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/gif0_defaultgwv6 using '2001:470:27:3d9::1'
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/gif0_defaultgwv6
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to 2001:470:27:3d9::1
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: skipping IPv4 default route
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to opt7
Feb 8 17:00:26 | opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'opt7'
Feb 8 17:00:26 | kernel: gif0: link state changed to UP
Feb 8 17:00:26 | kernel: gif0: link state changed to DOWN
After doing `opnsense-patch c83bb8d` it's still up, and it's been 28 minutes now. Seems promising. :D
I'm wondering if it's actually https://github.com/opnsense/core/commit/a1dbbb5e, since that is new for 19.1 as well, coming from FreeBSD directly, but I don't really understand how it could cause such disruptive behaviour...
# opnsense-revert opnsense
# opnsense-patch a1dbbb5e
A yes or no on that would be helpful. Thank you.
PS: There's also b20f71b to try, but it looks low impact.
It looks promising. I can't say yes yet, but maybe. The TV stream doesn't drop anymore, but right now I'm not connected through the VPN. The logfile looks like this:
Feb 9 12:38:16 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 're1_vlan10'
Feb 9 12:38:16 apu sshlockout[8895]: sshlockout/webConfigurator v3.0 starting up
Feb 9 12:38:16 apu opnsense: /usr/local/etc/rc.newwanip: On (IP address: X.X.X.X) (interface: WAN[wan]) (real interface: re1_vlan10).
Feb 9 12:38:17 apu kernel: stf0: changing name to 're1_vlan10_stf'
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: entering configure using 'wan'
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv6 default gateway set to wan
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: IPv4 default gateway set to wan
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv4 default route to Y.Y.Y.Y
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: keeping current default gateway 'Y.Y.Y.Y'
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: setting IPv6 default route to Z.Z.Z.Z
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: removing /tmp/re1_vlan10_stf_defaultgwv6
Feb 9 12:38:17 apu opnsense: /usr/local/etc/rc.newwanip: ROUTING: creating /tmp/re1_vlan10_stf_defaultgwv6 using 'Z.Z.Z.Z'
Feb 9 12:38:24 apu opnsense: /usr/local/etc/rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Feb 9 12:38:24 apu kernel: ovpns2: link state changed to DOWN
Feb 9 12:38:31 apu kernel: ovpns2: link state changed to UP
Feb 9 12:38:32 apu opnsense: /usr/local/etc/rc.newwanip: IP renewal is starting on 'ovpns2'
Feb 9 12:38:32 apu opnsense: /usr/local/etc/rc.newwanip: Interface '' is disabled or empty, nothing to do.
Feb 9 12:38:40 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: updating cache file /var/cache/dyndns.org_0.cache: X.X.X.X
Feb 9 12:38:40 apu opnsense: /usr/local/etc/rc.newwanip: Dynamic DNS: (Success) No change in IP address
It logs that the VPN interface goes down and up again, but I can't verify this right now. I can tell you on Monday. But the WAN interface didn't drop any connection. I didn't apply b20f71b, should I for your tests?
So I tested the VPN connection and unfortunately it drops. I'll apply b20f71b to test again, but I can't answer until tonight at the earliest.
The VPN connection also drops with patch b20f71b applied, but WAN seems to be stable all the time.
My issue is fixed with patch c83bb8d anyway. Still no issues.
c83bb8d is not going to be an official fix
@fichtner Why not if it's working?
We don't know what the error is, so it will come back if we just revert fixes for other problems. Not to mention having issues back that have been fixed with this. Does that make sense?
Totally
> The VPN connection also drops with patch b20f71b applied, but WAN seems to be stable all the time.

Since Saturday, WAN has been stable (but the VPN interface drops) with the following patches applied:
# opnsense-revert opnsense
# opnsense-patch a1dbbb5e
# opnsense-patch b20f71b
So what is the current state? Which patches should be applied? How can we help diagnose the issue?
We have this on an APU3C4 with WAN IP provided by DHCP.
For now we disabled "Dynamic state reset" ("Reset all states when a dynamic IP address changes. This option flushes the entire state table on IPv4 address changes in dynamic setups to e.g. allow VoIP servers to re-register."), which solved the problem with connections dropping and the IPsec tunnel getting stuck because of dropped states. But the rc.newwanip script still gets called every 1200s.
I'm out of the loop on this one and have just picked it up to read through. A long time ago (well, it seems a long time ago now) we had a similar issue across the road, with IPv6 dhcp6 running newwanipv6 on every renew of the client. This was the reason for adding the RENEW and REQUEST reasons to the output of dhcp6c: if the response from dhcp6c was RENEW, we stopped it from calling newwanipv6 in the script, as nothing would have changed. I have not looked, so I may be talking out of my derriere, but is something similar not possible with dhclient? It would solve an awful lot of problems in one go.
I have the same problem with 19.1.1 on two APU2C4 systems. Another system running 19.1.1 on APU1D4 doesn't show this problem.
Is there anything I can do to help diagnose the issue?
I have the same issue and started looking around in the log files. I believe the answer lies in System: Gateways: Log File and possibly with dpinger.
I see this in the log file repeating every 15 minutes:
dpinger: send_interval 1000ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr xxx.xxx.xxx.xxx bind_addr xxx.xxx.xxx.xxx identifier "WAN_DHCP
I noticed that by going to Firewall: Settings: Advanced and checking the "Disable State Killing on Gateway Failure" option I was able to stop the routine dropping of connections.
I then noticed the following:
Feb 11 16:26:10 dpinger: GATEWAY ALARM: WAN_DHCP (Addr: xxx.xxx.xxx.xxx Alarm: 0 RTT: 66816ms RTTd: 213265ms Loss: 0%)
Feb 11 16:26:10 dpinger: WAN_DHCP xxx.xxx.xxx.xxx: Clear latency 66816us stddev 213265us loss 0%
Feb 11 16:25:58 dpinger: GATEWAY ALARM: WAN_DHCP (Addr: xxx.xxx.xxx.xxx Alarm: 1 RTT: 832201ms RTTd: 0ms Loss: 0%)
Feb 11 16:25:58 dpinger: WAN_DHCP xxx.xxx.xxx.xxx: Alarm latency 832201us stddev 0us loss 0%
Could this be the culprit?
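For reference, the numbers in those two dpinger lines can be checked against the `latency_alarm 500ms` setting shown in the startup line. The sketch below only redoes that arithmetic. The helper name is hypothetical, and the strict "greater than" comparison is an assumption about how dpinger evaluates the threshold.

```shell
#!/bin/sh
# Values copied from the log lines above. exceeds_latency_alarm is a
# hypothetical helper, not part of dpinger or OPNsense.

# Succeeds when a latency given in microseconds is above a threshold
# given in milliseconds (assumed comparison: strictly greater than).
exceeds_latency_alarm() {
    latency_us=$1
    threshold_ms=$2
    [ "$latency_us" -gt $((threshold_ms * 1000)) ]
}

latency_alarm_ms=500

for latency_us in 832201 66816; do
    if exceeds_latency_alarm "$latency_us" "$latency_alarm_ms"; then
        echo "${latency_us}us: above ${latency_alarm_ms}ms, alarm"
    else
        echo "${latency_us}us: below ${latency_alarm_ms}ms, clear"
    fi
done
```

So 832201us is roughly 832ms, which is over the 500ms threshold, while 66816us is roughly 67ms and clears it again.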
what you see is dpinger reporting the very fact that there's a problem
> I have the same problem with 19.1.1 on two APU2C4 systems. Another system running 19.1.1 on APU1D4 doesn't show this problem.

Are there any configuration differences between your systems on the APU2C4 and APU1D4?

> I'm out of the loop on this one and have just picked it up to read through. A long time ago (well, it seems a long time ago now) we had a similar issue across the road, with IPv6 dhcp6 running newwanipv6 on every renew of the client. This was the reason for adding the RENEW and REQUEST reasons to the output of dhcp6c: if the response from dhcp6c was RENEW, we stopped it from calling newwanipv6 in the script, as nothing would have changed. I have not looked, so I may be talking out of my derriere, but is something similar not possible with dhclient? It would solve an awful lot of problems in one go.

Could this be the solution? Is someone willing to provide a patch that I can test?
No, we need to explain what happens first. Something changed, but the changes in the dhclient-script are too superficial to cause this so I'm afraid of later breakage if we don't pin this down.
I totally understand that. How can I provide further help?
Might be an idea to add some extra debug output to the dhclient-script and get @tbandixen to do some tests and give feedback. Unfortunately I'll not be able to add the extra bits myself until sometime next week!
@tbandixen - Little test for you. Copy the attached script to /usr/local/opnsense/scripts/interfaces. Back up the existing script first. See if it makes any difference at all. Pointers are needed to try and identify the issues, so all this script does is filter out the RENEW response and bypass any actions. Attachment: dhclient-script.zip
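The attached script itself is not reproduced in the thread. As a rough idea of what "filter out the RENEW response and bypass any actions" could look like, here is a minimal sketch. The function name is hypothetical, and the set of reasons treated as "needs reconfiguration" is an assumption for illustration, based on the reason values dhclient passes to its script (BOUND, RENEW, REBIND, REBOOT and so on).

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached dhclient-script.
# dhclient passes the event type to its script in $reason; here it is
# taken as an argument so the logic can be tried without dhclient.

needs_reconfigure() {
    case "$1" in
        RENEW)
            # Lease was only extended. Assumed (as suggested in the
            # thread) that nothing changed, so do nothing.
            return 1
            ;;
        BOUND|REBIND|REBOOT)
            # A lease was newly obtained or re-acquired.
            return 0
            ;;
        *)
            # Other reasons (EXPIRE, FAIL, ...) are left out of this sketch.
            return 1
            ;;
    esac
}

for reason in BOUND RENEW REBIND; do
    if needs_reconfigure "$reason"; then
        echo "$reason: reconfigure"
    else
        echo "$reason: skip"
    fi
done
```

In a real script the "reconfigure" branch is where the existing address, route and state handling would run. The sketch only shows the early exit for RENEW.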
OK, I'm ignoring the RENEW now (as provided in the script). Let's see if the VPN is stable now. Will report ASAP.
Well, as expected, the connection remains stable. I will re-enable the ignored lines step by step to identify the line causing it.
How can I manually trigger this script so I don't have to wait 20 minutes every time?
It's more a question of: if the 'RENEW' bypass helps, what has changed elsewhere that's causing the issue? Has something changed in dhclient itself perhaps? @fichtner can answer that one.
I think just bypassing the RENEW isn't that good; it's there for a reason. But I don't know if something changed behind the scenes (BSD things maybe, I don't know the *nix subsystem at all). I will try to find the line that causes the drop and dig a bit further.
Actually bypassing RENEW is good, it means nothing has changed so do nothing.
The key is that the var 'changes' is changed to 'yes'. What I would do next is add a logger entry to each and every if... fi statement block to identify the culprit.
Something like this:-
if [ "$old_ip_address" != "$new_ip_address" ]; then
$LOGGER "old ip address != new ip address at line 348"
delete_old_states
fi
I think we should check if `$alias_ip_address` is set, otherwise `if [ "$old_ip_address" != "$alias_ip_address" ]; then` and `if [ "$new_ip_address" != "$alias_ip_address" ]; then` will always be true. My `$alias_ip_address` is empty... Maybe that changed?
I think setting `changes="yes"` because of `[ "$old_ip_address" != "$alias_ip_address" ]` is wrong, because in `delete_old_alias` the alias is only removed if the length of `$alias_ip_address` is not zero. So maybe there is no change at all if the alias is empty, right? The same applies to `[ "$new_ip_address" != "$alias_ip_address" ]`.
My VPN connection remains stable if I add `[ -n "$alias_ip_address" ] &&` to lines 351 and 369.
So far so good, but did that change in 19.1? I didn't change any config targeting ISP dhcp client settings.
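To make the difference concrete, here is a small standalone sketch of the two comparisons. It is not the actual dhclient-script code: the two function names are hypothetical stand-ins, and the address is a made-up documentation address. Only the variable names and the `[ -n "$alias_ip_address" ] &&` guard come from the comment above.

```shell
#!/bin/sh
# Illustrative only. $1 is the new (or old) IP address, $2 the alias address.

# Unguarded check: an empty alias never equals a real address,
# so this always reports a change.
alias_differs_unguarded() {
    [ "$1" != "$2" ]
}

# Guarded check: only compare when an alias is actually configured.
alias_differs_guarded() {
    [ -n "$2" ] && [ "$1" != "$2" ]
}

new_ip_address="192.0.2.10"   # made-up address for the demo
alias_ip_address=""           # empty, as reported in the thread

changes="no"
if alias_differs_unguarded "$new_ip_address" "$alias_ip_address"; then
    changes="yes"
fi
echo "unguarded: changes=$changes"

changes="no"
if alias_differs_guarded "$new_ip_address" "$alias_ip_address"; then
    changes="yes"
fi
echo "guarded:   changes=$changes"
```

With an empty alias the unguarded version sets `changes=yes` on every run, even when the lease is simply renewed with the same address, while the guarded version leaves it at `no`.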
@tbandixen thanks for the hint. The check changed when it was aligned with FreeBSD. It seems the issue is present there too, but maybe it is not critical because it doesn't have a use case *shrugs*
# opnsense-revert opnsense
# opnsense-patch 90c0c395
That should be it then....
historic reference: https://github.com/pfsense/pfsense/commit/d0d7f09ab3853b
It's all stable now 😃
Thank you very much!
Ok, we'll wrap this up for 19.1.2. Thank you for the help. ❤️
Just want to be sure the devs are aware of "our" issue.
After 10-20 minutes of uptime all incoming connections are being dropped! So OpenVPN tunnels are dropped too. It was fine on 18.7.10.
In the General-system log I can see this every 20 minutes:
After every execution of rc.newwanip (even manually), all connections are dropped.
I watch TV over the internet and every 20 minutes the stream hangs, so I have to rewind to build up the stream again.
Here is the forum post: https://forum.opnsense.org/index.php?topic=11456.0
Is there any solution? What could it be? According to GitHub, the last changes to rc.newwanip were 5 months ago (https://github.com/opnsense/core/commits/master/src/etc/rc.newwanip). I think it has something to do with the switch to HardenedBSD, but I am absolutely not a Unix guy...