opnsense / core

OPNsense GUI, API and systems backend
https://opnsense.org/
BSD 2-Clause "Simplified" License
Setting a Policy on abuse.ch ThreatFox set in Intrusion Detection never completes? #6520

Closed gctwnl closed 1 year ago

gctwnl commented 1 year ago

Describe the bug Using the abuse.ch ThreatFox ruleset makes Apply on Policies never finish

To Reproduce

  1. Enable abuse.ch ThreatFox in your set of rules to download in Administration/Download.
  2. Add a policy
  3. Hit Apply

Expected behavior Policy saved, Apply button returns to non-working status.

Actual behavior Button keeps showing 'working' status ad infinitum

OPNsense 22.10.2-amd64 FreeBSD 13.1-RELEASE-p7 OpenSSL 1.1.1t 7 Feb 2023

OPNsense-bot commented 1 year ago

Thank you for creating an issue. Since the ticket doesn't seem to be using one of our templates, we're marking this issue as low priority until further notice.

For more information about the policies for this repository, please read https://github.com/opnsense/plugins/blob/master/CONTRIBUTING.md for further details.

The easiest option to gain traction is to close this ticket and open a new one using one of our templates.

AdSchellevis commented 1 year ago

you likely need a faster machine. ThreatFox is a very large ruleset (>260,000 rules), which takes quite some time to parse. Looking at our code to parse the metadata from the rules doesn't show much room for improvement, unfortunately.
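
To illustrate why parse time grows with ruleset size, here is a minimal sketch of per-rule metadata extraction. It is not the OPNsense parser: the function name, the regular expressions, and the sample rule are all hypothetical, and it assumes the common Suricata rule layout with `metadata:` and `sid:` options inside the parentheses.

```python
import re

# Hypothetical patterns for illustration; real rules can be more complex
# (for example, a msg string that itself contains "metadata:").
METADATA_RE = re.compile(r'metadata:\s*([^;]*);')
SID_RE = re.compile(r'\bsid:\s*(\d+)\s*;')


def parse_rule_metadata(rule):
    """Return the sid and the metadata key/value pairs of one rule line."""
    sid_match = SID_RE.search(rule)
    meta_match = METADATA_RE.search(rule)
    metadata = {}
    if meta_match:
        # Metadata entries are comma separated "key value" pairs.
        for item in meta_match.group(1).split(','):
            parts = item.strip().split(None, 1)
            if len(parts) == 2:
                metadata[parts[0]] = parts[1]
    return {
        'sid': int(sid_match.group(1)) if sid_match else None,
        'metadata': metadata,
    }


# Made-up sample rule, not taken from the ThreatFox ruleset.
SAMPLE_RULE = (
    'alert ip any any -> 192.0.2.1 any (msg:"example"; '
    'metadata: confidence_level 75, first_seen 2023_01_01; '
    'sid:90000001; rev:1;)'
)
print(parse_rule_metadata(SAMPLE_RULE))
```

Each rule needs its own pattern matching and string splitting, so the total work scales with the number of rules in the set.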

gctwnl commented 1 year ago

Understood. I am using an 800 series Deciso machine (4-core with 8GB RAM and fast SSD), which I purchased as 'overkill' so I could do these kinds of things. But I do wonder if my observation is correct. Maybe the ruleset was parsed (memory use goes up significantly) and it was just the button that never returned to 'normal'. 'Hours' seems strange here, if only because not all memory was used (about 70%) and CPU sank to a few %. Basically, the machine was idle but the button displayed 'busy'.

AdSchellevis commented 1 year ago

hmm, that's a bit odd. the 800 series should certainly be fast enough. could you try to execute the following command on a (ssh) console?

time /usr/local/opnsense/scripts/suricata/installRules.py 

At my end this takes quite some time, but certainly not over 60 seconds. (it could indeed be a gui glitch as well)

kulikov-a commented 1 year ago

@gctwnl it would be interesting to know what the browser dev console says when this happens. on a pretty slow vm this indeed takes more than the time-out limit (Timeout (120) executing : 'ids' restart in the backend log), but at least the Apply button stops 'spinning' after the status:"" response on the reconfigure call

gctwnl commented 1 year ago

Sure, I can help out and try. Basically, that means first installing/adding ThreatFox and testing. Any special way (CLI?) you want me to install the ThreatFox rule set?

AdSchellevis commented 1 year ago

there are a couple of things you can look at here, but if the ruleset is installed, I would first check how long it takes to deploy the rules from a (ssh) console as suggested in my previous posting. The other angle as @kulikov-a suggested is to check the browser console for errors

gctwnl commented 1 year ago

# time /usr/local/opnsense/scripts/suricata/installRules.py
       53.94 real        52.96 user         0.85 sys

Memory usage at 77% (6GB of 8GB) ;-)
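
As a rough cross-check of the timing above, the sketch below parses the `time` output and compares the wall-clock value with the 120 second limit quoted from the backend log earlier in this thread. The function names are hypothetical, and whether that same limit applies to every step behind the Apply button is an assumption.

```python
# Limit taken from the "Timeout (120)" backend log message quoted in this
# thread; assumed here to be expressed in seconds.
ASSUMED_TIMEOUT_SECONDS = 120


def parse_bsd_time(line):
    """Parse a line such as '53.94 real 52.96 user 0.85 sys' into floats."""
    tokens = line.split()
    # Tokens alternate between a value and its label.
    return {label: float(value) for value, label in zip(tokens[0::2], tokens[1::2])}


def within_timeout(line, limit=ASSUMED_TIMEOUT_SECONDS):
    """Return True when the wall-clock ('real') time is below the limit."""
    return parse_bsd_time(line)["real"] < limit


TIME_OUTPUT = "       53.94 real        52.96 user         0.85 sys"
print(parse_bsd_time(TIME_OUTPUT), within_timeout(TIME_OUTPUT))
```

With the measured 53.94 seconds, the script on its own finishes well inside that limit, which fits the suspicion that the problem sits elsewhere.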

gctwnl commented 1 year ago

This is what happens when I hit Apply on Policy:

[Error] Failed to load resource: cannot parse response (reconfigure, line 0)


I disabled ThreatFox and did the Download&Update via the GUI; that is then a lot faster. Memory use goes down to 5GB. I did two more Download&Updates, which brought the memory use back to 2.5GB.

Conclusion: suricata can handle it, but something goes wrong in the GUI. Not a real problem, then. It is cosmetic (but that can be confusing to people like me who do not do this for a living ;-) )

AdSchellevis commented 1 year ago

@gctwnl I think the ThreatFox ruleset has grown quite a lot over time, but we should certainly be able to fix the gui "crash" here. Let me move this to core and fix the console error this week.

gctwnl commented 1 year ago

Good plan.

The confusions I experienced were a nasty combination of this 'gui crash', OPNsense UX of ET Telemetry Pro ('Save' button that ET Open doesn't have — I would really look at that one), the fact that you need the 'Telemetry special ET Open rule set' because ET Telemetry ships with 'empty' sets, and the fact that Policy is required after turning IPS on because the rules generally ship as 'alerts'. Quite a killing combo. I understand it now, but I can imagine I am not the only one who will experience this difficult learning curve.

gctwnl commented 1 year ago

Sorry, accidentally closed. Not my call.

kulikov-a commented 1 year ago

hm. if i may, i'm not sure that this is only a gui bug: if data is undefined because of cannot parse response (reconfigure, line 0), then it would be very useful to know (via the network tab of the dev console) what kind of unparseable response was received on the reconfigure request (if I understand correctly, one of the configd actions returns something unexpected). and i think something useful should arrive in the system\backend logs at this moment?

AdSchellevis commented 1 year ago

https://github.com/opnsense/core/commit/5280cb346bc40fe9dc221dcdae6c9eb4543de74e should prevent it crashing out, but as @kulikov-a suggested, it would still be interesting to know what is being returned here.

gctwnl commented 1 year ago

> hm. if i may, i'm not sure that this is only a gui bug: if data is undefined because of cannot parse response (reconfigure, line 0), then it would be very useful to know (via the network tab of the dev console) what kind of unparseable response was received on the reconfigure request (if I understand correctly, one of the configd actions returns something unexpected). and i think something useful should arrive in the system\backend logs at this moment?

I'd be happy to check (isn't this easier for Ad who has access to much more than I have?). What exactly do you want? I have been using Safari on macOS with Develop turned on. I have a Network tab, but I have no idea how to get 'what kind of unparseable response was received on reconfigure request'. Basically, what I do is enable ThreatFox, then hit the Policy apply button.

kulikov-a commented 1 year ago

it would be great, thanks. it should be something like https://jun711.github.io/web/how-to-inspect-network-request-and-response-headers-on-safari/ i think. when you hit the Apply button, the 'reconfigure' request waiting for a response should be visible in the network tab. after a while the response will be received. usually the response body for this kind of request contains {"status":"OK"} or {"status":""}. but if the error is reproduced, then the response may contain something that breaks the script, so the response cannot be parsed.
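
The distinction described above can be sketched as a small classifier. This is only an illustration in Python, not the GUI code (which is not shown in this thread); the function name and the category labels are hypothetical.

```python
import json


def classify_reconfigure_response(body):
    """Classify a response body copied from the browser's network tab."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        # Empty, truncated, or non-JSON bodies end up here; this is the
        # case that would surface as "cannot parse response".
        return "unparseable"
    if not isinstance(data, dict) or "status" not in data:
        # Valid JSON, but not the {"status": ...} shape mentioned above.
        return "unexpected"
    return "ok" if data["status"] == "OK" else "not-ok"


for sample in ['{"status":"OK"}', '{"status":""}', '', '<html>error</html>']:
    print(repr(sample), "->", classify_reconfigure_response(sample))
```

A client that treats the "unparseable" case as a failed request, instead of assuming a parsed object is always available, can reset the button rather than leave it spinning.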

AdSchellevis commented 1 year ago

fixed in https://github.com/opnsense/core/commit/5280cb346bc40fe9dc221dcdae6c9eb4543de74e