theY4Kman / parsuricata

Parse Suricata rules
https://pypi.org/project/parsuricata/
MIT License
13 stars 3 forks source link

Multi value options are not parsed correctly #15

Closed asomsiko closed 12 months ago

asomsiko commented 1 year ago

Public Emergin Threats rule sid 2034270 for Suricata parse incorrectly. Grammar does not handle options with multiple value types (string and literal): reference:url,"vulnerability-lab.com/get_content.php?id=2295";

How reproduce:

  1. download https://rules.emergingthreats.net/open-nogpl/suricata-5.0/rules/emerging-malware.rules
  2. parse
  3. notice rule sid:2034270 failed to parse.

Rule sid:2034270 for reference:

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"ET EXPLOIT PHP Melody v3.0 SQL Injection Attempt"; flow:established,to_server; http.method; content:"GET"; http.uri; content:".php?vid="; content:"|2d 27 20|AND|20 28|SELECT|20|"; content:"|28|SELECT|28|SLEEP|28 5b|SLEEPTIME|5d 29 29 29 2d 2d|"; distance:9; fast_pattern; reference:url,"vulnerability-lab.com/get_content.php?id=2295"; classtype:trojan-activity; sid:2034270; rev:1; metadata:affected_product Web_Server_Applications, attack_target Web_Server, created_at 2021_10_27, deployment Perimeter, former_category EXPLOIT, signature_severity Major, updated_at 2021_10_27;)
tlansec commented 1 year ago

Hey,

I noticed the same issue today and opted to bodge a fix. Ideally you'd want it to parse each value properly, but instead I just made it so that it at least recognizes things formatted in this way and added a basic test for it. Here's the diff:

index 8a6a68a..c41e6f6 100644
--- a/parsuricata/_parser.py
+++ b/parsuricata/_parser.py
@@ -85,9 +85,11 @@ grammar = r'''

     ?settings_expr: STRING  -> string
                   | LITERAL -> literal
+                  | MULTIVALUESTRING -> multivaluestring

     STRING: /"[^\r\n]+?"(?=\s*[;])/
     LITERAL: /(?!\s+)([^;\\"]|(?!\\)\\[;\\"])+(?=;)/
+    MULTIVALUESTRING: /[a-z]{1,},"[^\r\n]+?"(?=\s*[;])/
 '''

 parser = Lark(
diff --git a/parsuricata/rules.py b/parsuricata/rules.py
index fb87704..c01b1d8 100644
--- a/parsuricata/rules.py
+++ b/parsuricata/rules.py
@@ -81,8 +81,14 @@ class Literal(str):
         return str(self)

+class MultiValueString(str):
+    """A quoted string with a key value before it."""
+    def __repr__(self):
+        return str(self)
+
+
 class Setting(str):
-    orig_value: Union[String, Literal, str]
+    orig_value: Union[MultiValueString, String, Literal, str]

     def __new__(cls, value):
         o = str.__new__(cls, value)
diff --git a/parsuricata/transformer.py b/parsuricata/transformer.py
index b83f195..7622c27 100644
--- a/parsuricata/transformer.py
+++ b/parsuricata/transformer.py
@@ -15,6 +15,7 @@ from .rules import (
     Setting,
     String,
     Variable,
+    MultiValueString
 )

@@ -100,3 +101,6 @@ class RuleTransformer(InlineTransformer):

     def negated(self, value: Any):
         return Negated(value)
+    
+    def multivaluestring(self, tok: Token):
+        return MultiValueString(tok.value)
diff --git a/test_parsuricata.py b/test_parsuricata.py
index f39be70..cab127c 100644
--- a/test_parsuricata.py
+++ b/test_parsuricata.py
@@ -155,6 +155,16 @@ def test_spaces_at_ends_of_string():
     assert expected == actual

+def test_multi_value_string():
+    rules = parse_rules('''alert dns any any -> any any (reference:url,"www.google.com";)''')
+    expected = [
+        Rule('alert', 'dns', 'any', 'any', '->', 'any', 'any', [
+            Option('reference', Setting('url,"www.google.com"'))
+        ])
+    ]
+    assert expected == rules
+
+
 EXAMPLE_RULES_CORPUS = r'''
 alert ip [127.0.0.1, 127.0.0.2] any -> ![8.8.8.8/24, 1.1.1.1] any ( msg:"Test rule"; sid:12345678; rev:1; )
 alert ip any 80:100 -> any any ( msg:"Fart"; )
theY4Kman commented 12 months ago

I finally got around to implementing this. I decided to parse all comma-delimited settings into a special Settings tuple, so isinstance(option.settings, tuple) and isinstance(option.settings, parsuricata.Settings) will both be true if comma-delimited strings/literals were found.

I added the sid:2034270 ET rule to the tests, to ensure correctness. Now, reference:url,"vulnerability-lab.com/get_content.php?id=2295" would parse into Settings((Setting('url'), Setting('vulnerability-lab.com/get_content.php?id=2295'))), and could be used like option.settings[0] == 'url'.

Since this is technically a breaking change, but I've yet to rip off the 1.0.0 seal (which, admittedly, still is far from deserved), I've released this change to 0.4.0 (previously 0.3.3).

Holler if you still have issues/desires/grievances