STIXProject / specifications

DRAFT STIX specification documents for version 1.2

STIX Difficulties: Indicators/Observable, Composition/Object hierarchy is complicated #70

Open terrymacdonald opened 8 years ago

PROBLEM

When complicated Observable Patterns are received, it is difficult for implementers to store, extract and use them. We need a way of simplifying the patterns themselves. Observable instances are fine: they are always related to each other, which amounts to an AND-type relationship, and this will make the creation of the new Sighting Object reasonably straightforward. Observable Patterns, on the other hand, potentially look for slight variations of particular observables, so the patterns needed to describe these variations can become quite complicated.

As an example…the following Observable pattern indicates that this is a bad email:

This can be complicated to describe, especially as there are multiple different ways to express it: Indicator Composition, Observable Composition or Related Objects.

POTENTIAL ANSWER

There must be a way of ‘flattening’ the Observable Composition/Indicator Composition hierarchy so that it still reflects the relationships but without a huge multi-level object-within-object-within-object structure. It could be as easy as expressing the Boolean logic as a string, using Object IDs, Boolean operators and nested parentheses. Something that generates a smaller output size would also be beneficial, as observables and indicators are likely to be the most numerous objects and would benefit most from a size reduction.
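As a rough illustration of the flattening idea, the sketch below renders a nested composition as a single Boolean string. The dictionary layout (`operator`, `children`, `idref`) and the shortened IDs are hypothetical stand-ins chosen for this example, not the STIX schema.

```python
def flatten(node):
    """Render a nested composition as one Boolean expression string."""
    if "idref" in node:  # leaf: a reference to an Observable Pattern by ID
        return node["idref"]
    parts = []
    for child in node["children"]:
        text = flatten(child)
        # Parenthesise nested compositions so the grouping stays explicit.
        if "children" in child:
            text = "(" + text + ")"
        parts.append(text)
    return (" " + node["operator"] + " ").join(parts)


# Hypothetical nested structure: A AND (B OR C)
nested = {
    "operator": "AND",
    "children": [
        {"idref": "example:Observable-A"},
        {
            "operator": "OR",
            "children": [
                {"idref": "example:Observable-B"},
                {"idref": "example:Observable-C"},
            ],
        },
    ],
}

expression = flatten(nested)
print(expression)
# example:Observable-A AND (example:Observable-B OR example:Observable-C)
```

The three-level nested object collapses to one line of text, which is where the size reduction would come from.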

One possible way of achieving this comes as a by-product of enforcing reference only. We could have a single Indicator Boolean string that references the IDs of the STIX Patterns (Observable Patterns) that are used in the Indicator. This would work as follows (in JSON notation):

{"stix:Indicators": [
  {
    "id": "example:indicator-8cf9236f-1b96-493d-98be-0c1c1e8b62d7",
    "timestamp": "2014-10-31T15:52:13.127931+00:00",
    "version": "2.1.1",
    "title": "Malicious E-mail",
    "type": "Malicious E-mail",
    "Condition": "example:Observable-437f0c20-ab26-4400-9f6a-fc395da3ddd9 AND (example:Observable-437f0c20-ab26-4400-9f6a-fc395da3ddd9 OR example:Observable-437f0c20-ab26-4400-9f6a-fc395da3ddd9)",
    "confidence": "High"
  }
]}

Please note that the STIX Patterns (Observable Patterns) would be defined in their own top-level Object array, and the Indicator Condition would be used rather than a top-level Relationship instance. The reason for this is that the top-level Relationship object would not be able to describe the Condition properly.
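Because the Condition refers to patterns only by ID, a consumer would probably want to check that every referenced ID resolves to an entry in the top-level array. The sketch below shows one way to do that. The `patterns` list, the shortened IDs and the assumption that AND and OR are the only operators are illustrative, not part of the proposal.

```python
import re

OPERATORS = {"AND", "OR"}  # assumption: only the operators shown in the example


def referenced_ids(condition):
    """Return the set of object IDs mentioned in a Condition string."""
    tokens = re.findall(r"[^\s()]+", condition)  # drop whitespace and parentheses
    return {tok for tok in tokens if tok not in OPERATORS}


def missing_references(indicator, patterns):
    """Return IDs used in the Condition that have no top-level pattern."""
    known = {pattern["id"] for pattern in patterns}
    return referenced_ids(indicator["Condition"]) - known


# Hypothetical data with shortened IDs for readability.
indicator = {
    "Condition": "example:Observable-1 AND (example:Observable-2 OR example:Observable-3)"
}
patterns = [{"id": "example:Observable-1"}, {"id": "example:Observable-2"}]

missing = missing_references(indicator, patterns)
print(missing)  # {'example:Observable-3'}
```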

It has been suggested that the current structure is actually very good for implementers, as it is reasonably straightforward to parse as an Abstract Syntax Tree (AST). Ultimately the implementer is trying to extract the Boolean logic in whatever form it takes, whether by parsing a Boolean string or by parsing a linked hierarchy of objects. Flattening these layers would still enable the AST to be derived from the Boolean condition.
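To show that an AST can still be derived from a flattened condition, here is a minimal recursive-descent parser sketch. It assumes AND binds more tightly than OR and that IDs contain no whitespace or parentheses. The proposal itself does not specify operator precedence, so that choice is an assumption of this example.

```python
import re

TOKEN = re.compile(r"\(|\)|[^\s()]+")  # parentheses, or a run of other characters


def parse(condition):
    """Parse a Boolean condition string into a nested-tuple AST."""
    tokens = TOKEN.findall(condition)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of condition")
        pos += 1
        return tok

    def parse_or():  # lowest precedence
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():  # assumed to bind tighter than OR
        node = parse_atom()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_atom())
        return node

    def parse_atom():  # an object ID or a parenthesised sub-expression
        tok = take()
        if tok == "(":
            node = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in (")", "AND", "OR"):
            raise ValueError("unexpected token: " + tok)
        return tok

    tree = parse_or()
    if peek() is not None:
        raise ValueError("trailing token: " + peek())
    return tree


ast = parse("example:Observable-1 AND (example:Observable-2 OR example:Observable-3)")
print(ast)
# ('AND', 'example:Observable-1', ('OR', 'example:Observable-2', 'example:Observable-3'))
```

The resulting tuples carry the same tree that the nested Observable Composition would have expressed, so a consumer that already works from an AST would not lose anything by receiving the string form.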

In addition, we should look at whether Indicator_Composition and Related_Objects are actually required in STIX v2.0. Anecdotally, it seems most people use Observable_Compositions when they need to compose a more complicated Boolean pattern match, which may indicate that Indicator_Composition and Related_Objects are not required in STIX v2.0. Section 23, “Which to use? Indicator Composition, Observable Composition, or referenced Object?”, discusses this further.