opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License

BGP and "URL table" aliases do not load IPs after restoring system configuration from backup #7075

Closed ruslanbay closed 9 months ago

ruslanbay commented 10 months ago

Steps To Reproduce

  1. Go to 'Firewall: Aliases'

  2. Create a BGP ASN alias

  3. Create a "URL Table (IPs)" alias with Refresh Frequency = 30 days and Content = https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/firewall/spy.txt

  4. Save and apply changes

  5. Go to System: Configuration: Backups

  6. Download the encrypted configuration file

  7. Deploy a new opnsense instance

  8. Go to System: Configuration: Backups

  9. Restore the system configuration using the file from step 6

  10. Go to Firewall: Aliases

Expected behavior

Applying the configuration file and restarting the firewall triggers an automatic alias update.

Actual behavior

Aliases are empty. There are no relevant entries in the logs. As a result, rules that reference these aliases do not block traffic.

Workaround

Set Refresh Frequency = 0.01 Hours, save and apply changes, then restore the previous value. This works for URL Table (IPs), but I don't know how to trigger an update for the BGP aliases.

Environment

OPNsense 23.7.10 (amd64)

ruslanbay commented 10 months ago

> 1. Create a BGP ASN alias

Sorry, that was a bad example since AS198097 doesn't contain any IP addresses. You can use any other ASN to reproduce the issue.

AdSchellevis commented 10 months ago

BGP content is fetched on a time interval (when /usr/local/share/bgp/asn.csv is too old or doesn't exist)

https://github.com/opnsense/core/blob/481859b41290300512b23c6fe98d619c6562134f/src/opnsense/scripts/filter/lib/alias/bgpasn.py#L38-L50

similar to geoip's
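
A minimal sketch of the kind of age check described above, assuming the decision is based only on whether the file exists and how old its modification time is. This is not the code from bgpasn.py; `needs_refresh` and `ASN_TTL` are hypothetical names, and the TTL value is taken from the "_asn_ttl = (86400 - 90)" line quoted later in this thread.

```python
import os
import time

# Assumption: TTL in seconds, mirroring the "_asn_ttl = (86400 - 90)" line quoted in this thread.
ASN_TTL = 86400 - 90


def needs_refresh(path, ttl=ASN_TTL, now=None):
    """Return True when the cache file is missing or at least ttl seconds old."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        # A missing (or unreadable) file always needs fetching.
        return True
    return age >= ttl
```

In this sketch, an empty file with a recent modification time still counts as fresh, because only the age is inspected.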

NateroniPizza commented 10 months ago

There needs to be a way to manually trigger this refresh, rather than waiting 24 hours before your services are back up (in my case, I lose control of my home automation system). Before it is suggested, no, the "Update and reload firewall aliases" cron job does not refresh the BGP ASNs, at least while in this error state.

Last time this happened, a few weeks ago, I was able to restore from a backup a second time after everything was up and running to re-trigger an update, but that doesn't seem to be working this time around.

Error in the log at the time indicates a name resolution error when it tried downloading the BGP lists, which is not unexpected given the WAN interface wasn't fully configured at the time.

EDIT: For anyone else stuck in this situation, you can temporarily change the 86400 in the "_asn_ttl = (86400 - 90)" line of bgpasn.py (located at /usr/local/opnsense/scripts/filter/lib/alias/bgpasn.py) to a low value (I changed it to 600, or 10 minutes), then reboot and wait a little while for it to force an update sooner. Once you've got the contents of the file, you can set the value back to normal. You can check whether it's been updated by viewing the last-modified time (or the contents) of /usr/local/share/bgp/asn.csv.

EDIT2: Hmm... That doesn't seem to have actually updated the BGP lists within OPNsense for some reason - the "Loaded#" value is still empty under Firewall>Aliases for the BGP lists. Poking around to see if I can find some way to force it to use the asn.csv file contents.

EDIT3: Got a workaround in place. Export the Alias.json file from Firewall>Aliases (tiny button in the bottom-right), change "updatefreq" to "0.001" (or whatever value - 0.001 is around 86 seconds), and re-upload it. Give it a minute and a half, and you'll have the contents updated. Then re-upload the original with whatever the default value is (I had one blank and one 0.5, so I don't know what the default is).
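
The "around 86 seconds" figure can be checked with a short calculation, assuming "updatefreq" is expressed in days so that the interval in seconds is 86400 times the value. `updatefreq_to_seconds` is a hypothetical helper for illustration, not part of OPNsense.

```python
SECONDS_PER_DAY = 86400


def updatefreq_to_seconds(updatefreq_days):
    """Convert an "updatefreq" value (assumed to be in days) to seconds."""
    return float(updatefreq_days) * SECONDS_PER_DAY


# The values mentioned in this thread, plus one full day for comparison.
for value in ("0.001", "0.5", "1"):
    print(value, "->", round(updatefreq_to_seconds(value), 1), "seconds")
```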

Here's a consolidated list of hoops to jump through to work around this issue (note that I do not know which of these are or are not necessary, particularly the reboots - I don't know if the contents of the .py file are re-read automatically, or if a service restart or reboot is needed):

  1. Edit the following file, changing "86400" to some small value such as 300 or 600: /usr/local/opnsense/scripts/filter/lib/alias/bgpasn.py
  2. Reboot, and wait a little while, periodically checking on the following file to see whether its contents get updated: /usr/local/share/bgp/asn.csv
  3. Revert the above change, and reboot.
  4. In the OPNsense webGUI, go to Firewall>Aliases, and using the small button in the bottom-right, download the Alias.json file.
  5. On all of the "type: asn" entries, change the "updatefreq" to 0.001 or something similarly small (that's 1.5 minutes, as it is 86400 * the value there), and save it as a new file. Upload this new file with the button next to the "download" described in the previous step, and click "apply" (don't know if the "apply" is necessary).
  6. Revert the settings by re-uploading the original Alias.json file, and clicking "apply."
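
The edit to the exported alias file in the steps above could be scripted roughly as follows. This is a sketch under assumptions: it treats the export as JSON (as the EDIT3 note indicates) and does not rely on any particular layout, instead walking the whole document and touching every object whose "type" is "asn". The function names and file paths are hypothetical.

```python
import json


def set_asn_updatefreq(node, new_value):
    """Recursively set "updatefreq" on every dict whose "type" is "asn".

    Returns the number of entries changed.
    """
    changed = 0
    if isinstance(node, dict):
        if node.get("type") == "asn":
            node["updatefreq"] = new_value
            changed += 1
        for child in node.values():
            changed += set_asn_updatefreq(child, new_value)
    elif isinstance(node, list):
        for child in node:
            changed += set_asn_updatefreq(child, new_value)
    return changed


def rewrite_export(src_path, dst_path, new_value="0.001"):
    """Write a modified copy of the export, leaving the original file untouched."""
    with open(src_path, encoding="utf-8") as fh:
        data = json.load(fh)
    changed = set_asn_updatefreq(data, new_value)
    with open(dst_path, "w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=2)
    return changed
```

Writing to a separate destination file keeps the original export available for the final revert step.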

AdSchellevis commented 10 months ago

rm /usr/local/share/bgp/asn.csv

?

ruslanbay commented 9 months ago

Hi @AdSchellevis Thank you for your help, but I'm a little confused. So, there is no easier way to trigger an alias update via WebUI? Or maybe it is possible to include/exclude the asn.csv file from backup?

AdSchellevis commented 9 months ago

@ruslanbay the file isn't in a backup, and when it's not there, it should be fetched on first use. The one thing I can imagine is that it couldn't fetch it the first time (no internet), left an empty file, and assumed it did collect it. If that's the case, we should probably be able to fix that. An easy way to test whether that's the issue is to replicate that scenario and take a peek at the file (if it's empty and later fetches are refused, there's something to fix).

NateroniPizza commented 9 months ago

> @ruslanbay the file isn't in a backup, and when it's not there, it should be fetched on first use. The one thing I can imagine is that it couldn't fetch it the first time (no internet), left an empty file, and assumed it did collect it. If that's the case, we should probably be able to fix that. An easy way to test whether that's the issue is to replicate that scenario and take a peek at the file (if it's empty and later fetches are refused, there's something to fix).

That has been the case both times this has happened to me - an error is displayed in the system logs stating that a DNS resolution error has occurred (due to no internet), and when I went to check the asn.csv file, it was empty.

AdSchellevis commented 9 months ago

https://github.com/opnsense/core/commit/4c097be8ea1226728885b392e202939ec8723302 should fix the zero length file issue.
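
For readers following along, the general idea of guarding against a zero-length cache file can be sketched like this. It is not the content of the commit linked above; `cache_is_usable` is a hypothetical name, and the sketch simply treats an empty file the same as a missing one.

```python
import os
import time


def cache_is_usable(path, ttl, now=None):
    """Return True only if the file exists, is non-empty, and is younger than ttl seconds."""
    now = time.time() if now is None else now
    try:
        st = os.stat(path)
    except OSError:
        # Missing or unreadable file: it has to be fetched.
        return False
    if st.st_size == 0:
        # A failed download may leave an empty file behind; do not trust it.
        return False
    return (now - st.st_mtime) < ttl
```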

NateroniPizza commented 9 months ago

> 4c097be should fix the zero length file issue.

Sweet! Thank you.

@ruslanbay was having a similar issue with URL Table (IPs) as well, though there was a way to decrease the refresh interval that was able to get it refreshed (though this still required manual intervention). I don't use those, so haven't looked into what files store those, and wouldn't know what their state is after the restore.