oasis-open / cti-pattern-matcher

OASIS TC Open Repository: Match STIX content against STIX patterns
https://github.com/oasis-open/cti-pattern-matcher
BSD 3-Clause "New" or "Revised" License
44 stars 20 forks source link

Matcher exits with zero, even when there is a FAIL. #40

Closed jkugler closed 7 years ago

jkugler commented 7 years ago
± stix2-matcher -p examples/pattern -f examples/good_obs.json 

PASS:  [ software:vendor = 'nginx' AND software:version = '1.13.2' ]
± echo $?
0
± stix2-matcher -p examples/pattern -f examples/bad_obs.json

FAIL:  [ software:vendor = 'nginx' AND software:version = '1.13.2' ]
± echo $?
0

Is this intentional? I realize there can be multiple patterns, and multiple matches, but shouldn't it exit with 1 if there are any fails?

gtback commented 7 years ago

Thanks, @jkugler, that's a good idea. The command line matcher works by handling many patterns, with a fixed set of STIX Cyber Observable data. IMO this is unrealistic for "normal" applications of pattern matching, where you have some set of patterns that you want to run over variable data, and report which observable(s) match which patterns.

Related, I don't think we should use "PASS" and "FAIL" (they are likely leftover from the pattern validator). More accurate would be "True"/"False" or "Matches"/"Does not match", and I'm not clear how those would map to exit codes.

We don't explicitly return 0, it's more a default of the Python runtime if you don't return something else. An error in the matcher should return a non-zero value.

@chisholm, what do you think?

chisholm commented 7 years ago

Yeah I feel like the commandline tool part is the part that I overhauled the least from its original incarnation. If the original dev came back and looked at the code, they probably wouldn't recognize anything except for the PASS/FAIL results and basic commandline functionality. I suspect by default, when the app isn't explicit about exit status, any "normal" program exit results in exit status 0; if an exception propagates out of the program, that results in non-zero exit status (and a stack trace).

I guess we have to decide what we want exit status to communicate. We could decide that:

But if people want to use the tool in a non-interactive, scriptable way to check whether a set of patterns matches some data, and rely on exit status, it also makes me think maybe we need a mode which switches off the usual output (which might otherwise get redirected to /dev/null anyway), and only produces an exit status to communicate results. E.g. a "quiet" mode?

jkugler commented 7 years ago

I did see an exit status of 2 when I didn't pass in args, so it appears the exit statuses are used already. I would vote for 0 is all normal, all PASS. 1 means something FAILed, with appropriate stdout output, 2 means program error, with appropriate errors printed to stderr.

To go along with that, I would suggest a -q option that only prints out FAIL/error messages, as the FAIL and error messages would be useful in a build pipeline. I suppose a -qq could mean no messages at all (if that's your thing). :)

chisholm commented 7 years ago

I did see an exit status of 2 when I didn't pass in args, so it appears the exit statuses are used already.

Hmm... that's interesting. Must be related to argparse. That may not be customizable. I guess some choices are already made for us.

gtback commented 7 years ago

Yes, that's coming from argparse, not our own code. But we can probably change that (if we wanted) by wrapping parse_args in a try/except.

My vote is to change "PASS" to "MATCH" and "FAIL" to "NO MATCH". Once we do that, we can decide what exit codes represent "some matches", "all matches", "no matches" (along the lines of what @chisholm has in his bulleted list). I also like the idea of -q and -qq flags.

@jkugler, I should have asked this earlier, but I'd be interested in hearing more about how you're using the matcher. It was originally written to work out some of the ambiguous parts of the STIX Patterning specification. Thanks!

jkugler commented 7 years ago

@gtback I was originally calling from a shell script with a pattern file and observation file, but then integrated it into a Python script via from stix2matcher import matcher and called matcher.match().

gtback commented 7 years ago

Thanks, @jkugler. That's certainly the way we envisioned it used. Is it ok for us to close this issue, then?

jkugler commented 7 years ago

@gtback Well, the original issue isn't resolved...even if I am importing and re-using code. :)

gtback commented 7 years ago

Fair enough. Stated another way, should "failure to match all patterns" return a non-zero status code, even if we change "PASS" to "MATCH" and "FAIL" to "NO MATCH"?

CC: @johnwunder, @clenk, @chisholm, @emmanvg, @rpiazza

jkugler commented 7 years ago

That seems reasonable to me. I'm do a PR.

jkugler commented 7 years ago

Oh, wait...I didn't implement the -q and -qq options...hold off on merging this. :)

jkugler commented 7 years ago

-q option added. Reviews welcome! :)

chisholm commented 7 years ago

Should this implement your behavioral choice, where errors produce exit status 2? I think if exceptions propagate out of the application, you get the default stack dump but an exit status of 1. You might have to intercept them to get what you want. E.g.

try:
    # Do stuff
except Exception:
    return_value = 2
    sys.excepthook(*sys.exc_info())

The excepthook stuff invokes the default behavior (stack dump) so the output looks the same.

jkugler commented 7 years ago

@chisholm Good point. I was thinking the arg parser would exit with 2, but you're right: other exceptions are raised in the code, and we should probably catch those and exit with 2.

jkugler commented 7 years ago

Done.

jkugler commented 7 years ago

Wait...I just realized something. The meaning of Pass/Fail and Match/no match is not the same. I'm going to have to change that. Pass means the pattern matched, but had the right value (i.e. the version was right). Fail means the pattern matched, and the value was wrong (wrong version). I'll have to think about that.

jkugler commented 7 years ago

After further reflection, I am leaving the code as-is. Fully understanding that a match would produce a zero exit value, and the fact that the match function returns true if there is a match, it would make sense in light of something like this:

if stix2-macher -p <pattern> -f <observation>; then
  # do something to fix the problem found by the match
else
  # do something else...or nothing
fi

This would match the if/then logic if one was directly calling the match function from one's own code.

So, ready for final review.

gtback commented 7 years ago

Fixed by #41