microsoft / sarif-pattern-matcher

Quality domain agnostic regular expression pattern matcher that persists results to SARIF
MIT License

Adding ML inference support for false positive filtering #803

Closed suvamM closed 1 year ago

suvamM commented 1 year ago

Changes

This PR adds support in SARIF Pattern Matcher for using custom machine learning models to filter out false positives. The PR:

  1. Adds a new validation state, TruePositiveDeterminedByML
  2. Elevates the level of a finding to Error if its validation state is TruePositiveDeterminedByML
  3. Adds a new helper method to StaticValidatorBase that uses an ML model to validate a finding. The method vectorizes a string into an integer array, loads an ONNX model, and obtains a prediction probability by running the model on the input vector. If the probability exceeds a threshold, the secret is deemed valid.
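
The flow in the list above can be sketched roughly as follows. This is a Python illustration, not the PR's actual code. The character-code vectorization, the fixed input length, the 0.5 threshold, the int64 input shape, and the function names are all assumptions made for the example; only the ONNX Runtime calls (`InferenceSession`, `get_inputs`, `run`) are real library API.

```python
from typing import Callable, List, Optional, Sequence, Tuple

MAX_LEN = 64      # hypothetical fixed input length expected by the model
THRESHOLD = 0.5   # hypothetical decision threshold


def vectorize(text: str, max_len: int = MAX_LEN) -> List[int]:
    """Map each character to its code point, then truncate or zero-pad.

    Assumed encoding; the real model may expect a different one.
    """
    codes = [ord(ch) for ch in text[:max_len]]
    return codes + [0] * (max_len - len(codes))


def load_onnx_predictor(model_path: str) -> Callable[[Sequence[int]], float]:
    """Wrap an ONNX model as a function from an integer vector to a probability.

    Assumes the model takes one int64 tensor of shape (1, N) and that the
    first value of its first output is the positive-class probability.
    """
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name

    def predict(vector: Sequence[int]) -> float:
        batch = np.asarray([vector], dtype=np.int64)
        outputs = session.run(None, {input_name: batch})
        return float(np.ravel(outputs[0])[0])

    return predict


def validate(
    candidate: str,
    predict: Callable[[Sequence[int]], float],
    current_level: str = "warning",
    threshold: float = THRESHOLD,
) -> Tuple[Optional[str], str]:
    """Return (validation state, level) for a candidate finding.

    If the predicted probability exceeds the threshold, the finding is marked
    TruePositiveDeterminedByML and its level is raised to "error"; otherwise
    no state is assigned and the level is left unchanged.
    """
    probability = predict(vectorize(candidate))
    if probability > threshold:
        return "TruePositiveDeterminedByML", "error"
    return None, current_level
```

With a real model file, usage might look like `validate(finding_text, load_onnx_predictor("model.onnx"))`, where the path is a placeholder. Passing the predictor in as a function keeps the threshold logic testable without a model on disk.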