opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

WAN does not initialize correctly after hard reboot due to power loss while behind ISP modem. #7068

Closed MaxMacNeill-UNB closed 3 months ago

MaxMacNeill-UNB commented 9 months ago

Important notices

Before you add a new report, we ask you kindly to acknowledge the following:

Apologies in advance if this is a duplicate, but I couldn't find anything that addressed my issue.

Describe the bug

I recently observed that after a hard reset of the router (read: power loss), if OPNsense comes up before the ISP modem does (it always does), the system provides full Internet access for about 40 seconds, then refuses to talk to anything on WAN afterwards. LAN works fine, and rebooting OPNsense fixes it. The WAN IP is the same in both states. So far I can find no reason for this behavior, but it persists in the current version. I don't know whether previous versions are affected, as I have not experienced a power loss like this before.

To Reproduce

Steps to reproduce the behavior:

  1. Connect OPNsense behind an ISP modem in DMZ mode (I have a Bell Aliant Gigahub).
  2. Put them both on the same breaker.
  3. Flip the breaker off and back on, then observe that both boot correctly but OPNsense does not provide Internet to clients.
  4. Reboot OPNsense and observe that it suddenly works.

Expected behavior

Power loss should be handled cleanly without manual intervention.

Describe alternatives you considered

I have looked into setting up a script to delay boot, but this would slow down reboots that are not caused by power loss, and even then it would be a hacky workaround at best.

Environment

Software version used and hardware type if relevant, e.g.:

OPNsense 23.7.9 (amd64)
Intel® Core i5-10500
Network: Supermicro AOC-SG-i2

fichtner commented 9 months ago

When that happens you are likely getting a WAN IP from another range. In those cases it is advisable to configure WAN DHCP to reject leases from the server in that problematic range and then DHCP waits for the modem to properly initialise.

Cheers, Franco

MaxMacNeill-UNB commented 9 months ago

I only wish this were the solution, but it is not the case. My correct WAN IP is 192.168.2.11, and I have confirmed that OPNsense holds that IP while the network is not working. My ISP modem is set up in DMZ mode but has the capabilities of a router: it's a static DHCP lease with all traffic forwarded, and it loads correctly. The issue is elsewhere.

MaxMacNeill-UNB commented 9 months ago

To clarify, this is a double NAT, not CGNAT. There is no way to bypass the ISP router without dropping a few hundred dollars on specialty hardware, as the SFP (serialized and containing authentication hardware) is soldered to the mainboard.

mimugmail commented 9 months ago

Can you set up a ping from internal to outside, watch system latest.log and post the message when ping stops?

MaxMacNeill-UNB commented 9 months ago

It will be difficult to time: to trigger the bug I'd have to hard reset OPNsense, then log in, configure a ping, and open the logs, all before the error is triggered. I will try come morning (it's about 10 minutes to midnight here).

mimugmail commented 9 months ago

You can ping from your local machine and keep an eye on your clock, then compare with the log afterwards.
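A minimal sketch of that approach, assuming Python 3 on the client and a system `ping` that accepts `-c` (Linux, macOS, FreeBSD). It records a wall-clock timestamp each time the ping result flips, so the times can be compared against latest.log afterwards. `ping_once` and `watch` are hypothetical helper names for illustration, not part of OPNsense.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Send one ICMP echo request via the system ping; True if a reply arrived."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],  # -c 1: single probe (Windows uses -n)
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # give up on the probe after timeout_s seconds
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def watch(probe, count, interval_s=1.0, now=datetime.now, sleep=time.sleep):
    """Call probe() `count` times; return (HH:MM:SS, state) for each state change."""
    changes = []
    last = None
    for i in range(count):
        state = "up" if probe() else "down"
        if state != last:
            stamp = now().strftime("%H:%M:%S")
            changes.append((stamp, state))
            print(f"{stamp} {state}", flush=True)
            last = state
        if i < count - 1:
            sleep(interval_s)
    return changes
```

For example, `watch(lambda: ping_once("1.1.1.1"), count=600)` would probe roughly once per second for about ten minutes; the target address here is only an example. Passing the probe in as a function keeps the timestamp logic separate from the actual ping, so it can be checked without network access.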

MaxMacNeill-UNB commented 9 months ago

I see nothing of issue in the logs, but maybe you'll do better than I did. Here are the relevant timestamps.

Power on - 11:44:16
OPNSense beep script - 11:45:33
Network up on client, ping starts working - 11:45:50
Ping goes to Destination unreachable - 11:46:30
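As a quick check of the intervals between those events, here is a small sketch that assumes all four timestamps are from the same day on a 24-hour clock. The `gaps` helper is a hypothetical name used only for this illustration.

```python
from datetime import datetime

# Timestamps copied from the comment above (same day assumed).
events = [
    ("power on", "11:44:16"),
    ("OPNsense beep script", "11:45:33"),
    ("ping starts working", "11:45:50"),
    ("ping: destination unreachable", "11:46:30"),
]


def gaps(events):
    """Return (from_label, to_label, seconds) for each consecutive pair of events."""
    parsed = [(label, datetime.strptime(t, "%H:%M:%S")) for label, t in events]
    return [
        (a[0], b[0], int((b[1] - a[1]).total_seconds()))
        for a, b in zip(parsed, parsed[1:])
    ]


for start, end, seconds in gaps(events):
    print(f"{start} -> {end}: {seconds} s")
```

The last interval comes out at 40 seconds, which matches the "about 40 seconds" of working Internet described in the original report.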

latest.log

MaxMacNeill-UNB commented 8 months ago

Hey - just wanted to check in if there's any other information you folks need other than that log? I pulled it from the system folder, I assume it's the right one.

Maxwelldoug commented 7 months ago

Hey, just following up (from my main account rather than my school one; I didn't even realize that was logged in, oops). The issue still persists as of the latest version of OPNsense.

MaxMacNeill-UNB commented 5 months ago

Updating mid-March to prevent stale issue closure. This persists.

OPNsense-bot commented 3 months ago

This issue has been automatically timed-out (after 180 days of inactivity).

For more information about the policies for this repository, please read https://github.com/opnsense/core/blob/master/CONTRIBUTING.md for further details.

If someone wants to step up and work on this issue, just let us know, so we can reopen the issue and assign an owner to it.