Structure discarding short episode subsequences

Description

The break_into_subbehaviors function, which is responsible for cutting episode sequences into episode subsequences, discards short subsequences, for example:

This check occurs multiple times within the function and is also present in the generate_traces function. Ideally, it should live in only one place (in generate_traces).

Furthermore, when a sequence such as [low, low, medium, high, low] is split, [low, low, medium, high] is saved but the trailing [low] is simply discarded. We probably shouldn't lose alerts like this. On the other hand, it is not clear what to do with a single event either. Maybe we keep them regardless?
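
To make the lost-alert case concrete, here is a minimal, self-contained sketch. It is not SAGE's code: the function names, the "cut when severity drops" rule, and the minimum length of 2 are all assumptions chosen only to reproduce the example above.

```python
# Hypothetical illustration, not the actual break_into_subbehaviors logic.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def split_on_severity_drop(sequence):
    """Cut the sequence wherever severity falls below the previous item (assumed rule)."""
    subsequences, current = [], []
    for severity in sequence:
        if current and SEVERITY_RANK[severity] < SEVERITY_RANK[current[-1]]:
            subsequences.append(current)
            current = []
        current.append(severity)
    if current:
        subsequences.append(current)
    return subsequences


def drop_short(subsequences, min_len=2):
    """Discard subsequences shorter than min_len (threshold is an assumption)."""
    return [s for s in subsequences if len(s) >= min_len]


parts = split_on_severity_drop(["low", "low", "medium", "high", "low"])
kept = drop_short(parts)
print(parts)  # both pieces, including the trailing single event
print(kept)   # the trailing single event is gone
```

With these assumptions, the split yields two pieces, and the length filter silently removes the second one, which is the behavior questioned above.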

Proposed solution

Keep only the check in the generate_traces function and update the break_into_subbehaviors function accordingly, so that it no longer filters by length itself. The resulting attack graphs should be the same as before.
For now, these short episodes can still be discarded, as before; whether to keep them will be left to the user.
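
One possible shape for this refactoring is sketched below. The names split_episodes and collect_traces are hypothetical stand-ins for break_into_subbehaviors and generate_traces, with simplified signatures; the min_len default and the keep_short flag are assumptions, not existing SAGE options.

```python
# Hypothetical sketch: the splitter only splits, the caller owns the length check.
def split_episodes(sequence, is_cut_point):
    """Split a sequence at every cut point; performs no length filtering."""
    subsequences, current = [], []
    for item in sequence:
        if current and is_cut_point(current[-1], item):
            subsequences.append(current)
            current = []
        current.append(item)
    if current:
        subsequences.append(current)
    return subsequences


def collect_traces(sequences, is_cut_point, min_len=3, keep_short=False):
    """Single place where short subsequences are dropped, unless the user opts out."""
    traces = []
    for sequence in sequences:
        for sub in split_episodes(sequence, is_cut_point):
            if keep_short or len(sub) >= min_len:
                traces.append(sub)
    return traces


# Usage: severities encoded as integers, cutting whenever the value decreases.
def severity_drops(prev, cur):
    return cur < prev


default_traces = collect_traces([[1, 1, 2, 3, 1]], severity_drops)
all_traces = collect_traces([[1, 1, 2, 3, 1]], severity_drops, keep_short=True)
```

Keeping the filter in the caller means the default output matches the old behavior, while a user who wants the short subsequences only has to flip one flag.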

tudelft-cda-lab / SAGE

Structure discarding short episode subsequences #29
