oasis-open / cti-pattern-validator

OASIS TC Open Repository: Validate patterns used to express cyber observable content in STIX Indicators
https://stix2-patterns.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Errors parsing content from `cti-stix-generator` #87

Open dganev-cb opened 2 years ago

dganev-cb commented 2 years ago

I am trying to generate Indicator objects with the generator. It generates the indicators successfully, but when I parse their patterns with stix2patterns, certain indicators produce the following error:

stix2patterns.exceptions.ParseException: 1:145: mismatched input 't'2021-10-01T08:52:52.795715Z'' expecting StringLiteral

The generated Indicator causing the issue:

{
      "type": "indicator",
      "spec_version": "2.1",
      "id": "indicator--859073c6-4d82-43b2-98ff-60e2679e7690",
      "created": "2021-03-19T02:27:21.451622Z",
      "modified": "2021-05-16T14:01:24.372781Z",
      "description": "Pay foot out range.",
      "pattern": "([mac-addr:value > '8f:7c:83:49:57:83'] OR (([domain-name:value NOT = 'walker.org'] AND ([artifact:encryption_algorithm <= 'AES-256-GCM']) START t'2021-10-01T08:52:52.795715Z' STOP t'2022-03-21T19:45:37.539105Z'))) START t'2021-10-10T17:30:30.515273Z' STOP t'2023-02-28T05:27:18.193378Z'",
      "pattern_type": "stix",
      "pattern_version": "2.1",
      "valid_from": "2021-05-20T11:06:18.971499Z",
      "lang": "en"
}

clenk commented 2 years ago

Hi @dganev-cb I just released a new version of the pattern validator which should fix it. Please let me know if it doesn't.

dganev-cb commented 2 years ago

Hi @clenk, it is happening again with version 2.0.0 of the pattern validator; same error.

chisholm commented 2 years ago

Can you give a code snippet that shows what is failing? It does seem like a STIX 2.0 vs 2.1 error. I tried validating the pattern in your indicator, and it worked.

import stix2patterns.validator

patt = "([mac-addr:value > '8f:7c:83:49:57:83'] OR (([domain-name:value NOT = 'walker.org'] AND ([artifact:encryption_algorithm <= 'AES-256-GCM']) START t'2021-10-01T08:52:52.795715Z' STOP t'2022-03-21T19:45:37.539105Z'))) START t'2021-10-10T17:30:30.515273Z' STOP t'2023-02-28T05:27:18.193378Z'"

print(stix2patterns.validator.validate(patt, print_errs=True, stix_version="2.1"))

Output:

>python test.py
True

dganev-cb commented 2 years ago

Yes, I can confirm that the error is elsewhere. Thank you!

Traceback:

    Pattern(indicator.pattern).walk(stix_pattern_parser)
  File "<truncated>/stix/venv/lib/python3.8/site-packages/stix2patterns/v20/pattern.py", line 21, in __init__
    self.__parse_tree = self.__do_parse(pattern_str)
  File "<truncated>/stix/venv/lib/python3.8/site-packages/stix2patterns/v20/pattern.py", line 112, in __do_parse
    six.raise_from(ParseException(error_listener.error_message),
  File "<string>", line 3, in raise_from
stix2patterns.exceptions.ParseException: 1:60: mismatched input 't'2022-10-10T04:36:06.906343Z'' expecting StringLiteral
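
A note on the traceback above: its frames are in stix2patterns/v20/pattern.py, so the pattern appears to have been parsed with the STIX 2.0 grammar, which expects a plain string literal after START/STOP, while the t'...' timestamp literal is STIX 2.1 syntax. One possible way to avoid the mismatch is to choose the Pattern class from the Indicator's pattern_version. This is only a minimal sketch: the helper names and the version mapping are assumptions made for illustration, and the only library names used are the two module paths that already appear in this thread.

```python
import importlib

# Assumed mapping from an Indicator's "pattern_version" to the stix2patterns
# module holding the matching Pattern class. Both module paths appear in this
# thread; the mapping itself is an illustration, not part of the library.
_PATTERN_MODULES = {
    "2.0": "stix2patterns.v20.pattern",
    "2.1": "stix2patterns.v21.pattern",
}


def pattern_module_name(pattern_version):
    """Return the module path for a pattern_version string (hypothetical helper)."""
    try:
        return _PATTERN_MODULES[pattern_version]
    except KeyError:
        raise ValueError("unsupported pattern_version: %r" % (pattern_version,))


def load_pattern_class(pattern_version):
    """Import and return the version-specific Pattern class.

    The import happens here, so stix2patterns must be installed when this
    function is called.
    """
    module = importlib.import_module(pattern_module_name(pattern_version))
    return module.Pattern


# Usage, with a hypothetical `indicator` dict and `listener` object:
#   Pattern = load_pattern_class(indicator["pattern_version"])
#   Pattern(indicator["pattern"]).walk(listener)
```

With the indicator shown earlier (pattern_version 2.1), this would select stix2patterns.v21.pattern rather than the v20 module seen in the traceback.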

Edit: I didn't mean to close the issue, since the error is in the same package.

Context

I am extending the STIXPatternListener like so:


stix_pattern_parser = STIXPatternParser()
Pattern(indicator.pattern).walk(stix_pattern_parser)

###

class STIXPatternParser(STIXPatternListener):
    """STIXPatternListener extender for the custom parsing of the STIX Pattern."""

    def __init__(self) -> None:
        self.list_of_custom_objects = []

    def enterPropTestEqual(self, context) -> None:
        """Entering the properties of a STIX Pattern.

        Args:
            context: The STIX Pattern Context

        Returns:
            None
        """
        parts = [child.getText() for child in context.getChildren()]
        # Getting the parts which are:
        # [0]: The stix field type (eg. `ipv4-addr:value`)
        # [1]: Always `=` sign
        # [2]: The value inside single quotes (eg. `'127.0.0.1'`)
        if parts and len(parts) == 3:
            stix_field_type = parts[0]
            stix_field_value = parts[2]
            obj = <parsing_code>
            self.list_of_custom_objects.append(obj)

    def enterPattern(self, *args, **kwargs):
        self.list_of_custom_objects = []

This is probably a bad design, but the idea was to initialize the STIXPatternParser and create objects out of the pattern; for a more complex pattern, I wanted to create multiple custom objects. Does that extended class make sense, or should I use something else instead?

chisholm commented 2 years ago

In general, I think you do need to use the parse tree or AST when doing any kind of pattern processing which depends on pattern structure. Using e.g. regexes to pull things out of STIX patterns tends not to work very well. So I think it's a fine idea. The STIXPatternListener class is generated by ANTLR, just contains method stubs, and is intended to be subclassed (and similarly with the visitor).

I don't know what exactly you're trying to do, but subclassing the listener is certainly using it as designed.
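
One detail that may matter for a listener like the one above: it only handles comparisons with exactly three children, and the literal text still carries its quotes. The sketch below is plain Python that works on the list of child texts (the `parts` list from the listener) and needs no parser. It assumes that a negated test such as `domain-name:value NOT = 'walker.org'` shows up as four children with `NOT` in second position; that is an assumption about the generated parse tree, and both function names are hypothetical.

```python
import re


def unquote_stix_string(literal):
    """Strip the surrounding single quotes from a STIX string literal and
    undo backslash escapes of a quote or a backslash."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a quoted STIX string literal: %r" % (literal,))
    body = literal[1:-1]
    # Replace an escaped quote or escaped backslash with the bare character.
    return re.sub(r"\\(['\\])", r"\1", body)


def split_equality_test(parts):
    """Split child texts into (object_path, negated, operator, literal_text).

    Accepts either [path, operator, literal] or, by assumption,
    [path, 'NOT', operator, literal] for a negated comparison.
    """
    if len(parts) == 3:
        path, operator, literal = parts
        return path, False, operator, literal
    if len(parts) == 4 and parts[1].upper() == "NOT":
        path, _, operator, literal = parts
        return path, True, operator, literal
    raise ValueError("unexpected comparison shape: %r" % (parts,))
```

Inside the listener, `split_equality_test(parts)` could replace the `len(parts) == 3` check, and `unquote_stix_string` could be applied to the literal before building a custom object.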

dganev-cb commented 2 years ago

I think I wasn't clear enough. What I am trying to do is as follows: given the pattern [ipv4-addr:value = '203.0.113.1' OR ipv4-addr:value = '203.0.113.2'], I want to create two custom objects out of it, one per ipv4 value: CustomObject<203.0.113.1> and CustomObject<203.0.113.2>. That's all.

chisholm commented 2 years ago

I think you have the right approach. FYI, if your needs are simple enough, the inspector might give you what you need in a simpler way. The inspector pulls some summary information out of patterns and puts it into a simple data structure:

import pprint

import stix2patterns.v21.pattern

patt = "[ipv4-addr:value = '203.0.113.1' OR ipv4-addr:value = '203.0.113.2']"

p = stix2patterns.v21.pattern.Pattern(patt)
inspect_results = p.inspect()
pprint.pprint(inspect_results.comparisons)

results in:

{'ipv4-addr': [(['value'], '=', "'203.0.113.1'"),
               (['value'], '=', "'203.0.113.2'")]}

The inspector is itself written as a STIXPatternListener subclass. If it doesn't produce enough detail, you can write your own subclass, as you were doing.
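
As a rough illustration of the inspector route, the comparisons structure shown above can be turned into one object per value with a few lines of plain Python. This is a sketch under assumptions: `CustomObject` is a hypothetical stand-in for whatever class is actually needed, `objects_from_comparisons` is not part of stix2patterns, and the input is the dictionary copied from the output above rather than a live `inspect()` call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomObject:
    """Hypothetical stand-in for the user's own object type."""
    object_type: str
    path: tuple
    value: str


def objects_from_comparisons(comparisons, operator="="):
    """Build one CustomObject per comparison that uses the given operator."""
    objects = []
    for object_type, tests in comparisons.items():
        for path, op, literal in tests:
            if op != operator:
                continue
            # Drop the surrounding single quotes of a string literal.
            # Escaped characters inside the literal are left untouched.
            if len(literal) >= 2 and literal[0] == "'" and literal[-1] == "'":
                literal = literal[1:-1]
            objects.append(CustomObject(object_type, tuple(path), literal))
    return objects


# Data copied from the inspector output shown above.
comparisons = {
    "ipv4-addr": [(["value"], "=", "'203.0.113.1'"),
                  (["value"], "=", "'203.0.113.2'")],
}
custom_objects = objects_from_comparisons(comparisons)
```

For the example pattern, `custom_objects` holds two entries, one per IPv4 address. Note that this flattens the pattern: any AND/OR structure or qualifiers are ignored, which is fine only when the goal is simply to collect the values.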