ComplianceCow / CAML

Continuous Audit Metrics Catalog

Decide on the expression standard in CAML #41

Closed rajkrishnamurthy closed 2 years ago

rajkrishnamurthy commented 2 years ago

Per the discussion on 02/16/2022: can we take a standardized approach to expressions, similar to the approach we took for the frequency or period attributes (expressed as a crontab)? This will ensure that all expressions are coded to a standard. We need to adhere to the following:

  1. It is the responsibility of the provider to implement the metrics expression per the standard. CAML does not specify how the implementation should happen
  2. There should be no ambiguity in the operators
  3. Implementing the expression securely is the responsibility of the provider

rajkrishnamurthy commented 2 years ago

I would recommend forming the expression per the operator standards in Python. An expression is an algebraic expression that evaluates to a boolean value or a number. For now, we should limit it to the following [refer to (1)]:

  1. Arithmetic operators
  2. Comparison operators
  3. Logical operators
  4. Membership operators
  5. Conditional (ternary) operator

We will not support the following:

  1. Assignment operators
  2. Identity operators
  3. Bitwise operators
  4. String operators

We will also not support any functions (custom or built-in), such as sum() or pow().
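As a rough illustration of the subset proposed above, the following sketch checks whether an expression string uses only the supported operator families. It is not part of CAML: the names `ALLOWED_NODES` and `is_allowed_expression` are hypothetical, and the mapping from the operator families to Python `ast` node types is an assumption about how a provider might interpret the list.

```python
import ast

# Hypothetical whitelist: Python AST node types matching the supported
# operator families. Anything else (calls, attribute access, identity,
# bitwise, assignment) is rejected because it is absent from this tuple.
ALLOWED_NODES = (
    ast.Expression, ast.Constant, ast.Name, ast.Load,
    ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare, ast.IfExp,
    ast.List, ast.Tuple,
    # 1. arithmetic operators
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.FloorDiv, ast.Mod, ast.Pow,
    ast.UAdd, ast.USub,
    # 2. comparison operators
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
    # 3. logical operators
    ast.And, ast.Or, ast.Not,
    # 4. membership operators
    ast.In, ast.NotIn,
    # 5. the conditional (ternary) operator is ast.IfExp, listed above
)


def is_allowed_expression(expression: str) -> bool:
    """Return True if the expression parses and uses only whitelisted syntax."""
    try:
        # mode="eval" accepts a single expression, so statements such as
        # assignments fail to parse and are rejected here.
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))


print(is_allowed_expression("patched / total * 100"))     # arithmetic
print(is_allowed_expression("1 if open_count > 5 else 0"))  # ternary
print(is_allowed_expression("sum(values)"))               # function call
```

This check is purely syntactic. It does not try to tell string concatenation apart from numeric addition, which would need type information about the operands.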

Should we write a markdown similar to (4)?

References:
(1) https://www.w3schools.com/python/python_operators.asp
(2) https://realpython.com/python-eval-function/
(3) https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Expressions_and_Operators
(4) https://github.com/Knetic/govaluate/blob/master/MANUAL.md

pritikin commented 2 years ago

2/18 discussion

There is a distinction between collecting measures and computing the metrics from them. Here we are focused on computing the metrics from measures that have already been collected. Our current metrics use relatively simple arithmetic operators (item 1 in the list above). Currently we don't use the more complex operators in items 2-5. There are areas where the expression is more complex, such as:

SEF-06-M2 which uses "Slope represented as a percentage: A = SLOPE(triage times for security events, dates for security events) * 100"
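To make concrete why a metric like this goes beyond plain arithmetic operators, here is a minimal sketch of the slope calculation. It assumes SLOPE means the ordinary least-squares regression slope, as in the spreadsheet function of the same name. The function names, the choice of hours for triage time, and the conversion of dates to day numbers are all hypothetical and are not taken from the metric definition.

```python
from datetime import date


def slope(ys, xs):
    """Least-squares slope of ys against xs (assumed meaning of SLOPE)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def triage_slope_percentage(triage_hours, event_dates):
    """A = SLOPE(triage times, dates) * 100, with dates as day ordinals."""
    # Assumption: dates are converted to day numbers so the slope is
    # expressed as change in triage time per day.
    days = [d.toordinal() for d in event_dates]
    return slope(triage_hours, days) * 100


events = [date(2022, 1, 1), date(2022, 1, 2), date(2022, 1, 3)]
hours = [10.0, 8.0, 6.0]  # triage time drops by 2 hours per day
print(triage_slope_percentage(hours, events))  # -200.0
```

Since the proposal excludes function calls, a provider would presumably compute a value like this as a measure first and hand the result to the expression as a variable.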

Looking forward, it may be necessary to include scoping issues in the metrics. What happens when a metric like TVM-03-M1 is applied to a complex system that has a few high-priority systems and hundreds or thousands of lower-priority systems? As we mature, we expect the metric expressions to become more complex.

So for now, some basic requirements:

An additional requirement: this must be safe for unsanitized inputs. For example, we can't just use Python's 'eval', because doing so exposes everything in the Python language instead of just the math expressions and variables we care about.
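One way a provider might meet this requirement without 'eval' is to parse the expression and walk the syntax tree, evaluating only the node types it recognizes. The sketch below is one possible approach under stated assumptions, not a CAML-specified implementation. The name `safe_eval` and the operator tables are hypothetical, string literals are rejected, and the power operator is deliberately left out because very large exponents can exhaust memory or CPU.

```python
import ast
import operator

# Hypothetical operator tables; only these operators are evaluated.
BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg,
             ast.Not: operator.not_}
COMPARE_OPS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.In: lambda a, b: a in b, ast.NotIn: lambda a, b: a not in b,
}


def safe_eval(expression, variables):
    """Evaluate a restricted expression against a dict of measure values."""
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"not a valid expression: {expression!r}") from exc
    return _eval(tree.body, variables)


def _eval(node, variables):
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)):  # bool is a subclass of int
            return node.value
        raise ValueError("only numeric and boolean literals are supported")
    if isinstance(node, ast.Name):
        if node.id in variables:
            return variables[node.id]
        raise ValueError(f"unknown variable: {node.id}")
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](_eval(node.left, variables),
                                         _eval(node.right, variables))
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_eval(node.operand, variables))
    if isinstance(node, ast.BoolOp):
        # all()/any() short-circuit; the result is always a bool here.
        results = (_eval(value, variables) for value in node.values)
        return all(results) if isinstance(node.op, ast.And) else any(results)
    if isinstance(node, ast.Compare):
        left = _eval(node.left, variables)
        for op, comparator in zip(node.ops, node.comparators):
            if type(op) not in COMPARE_OPS:
                raise ValueError(f"unsupported operator: {type(op).__name__}")
            right = _eval(comparator, variables)
            if not COMPARE_OPS[type(op)](left, right):
                return False
            left = right  # supports chained comparisons such as a < b < c
        return True
    if isinstance(node, ast.IfExp):
        branch = node.body if _eval(node.test, variables) else node.orelse
        return _eval(branch, variables)
    if isinstance(node, (ast.List, ast.Tuple)):
        return [_eval(element, variables) for element in node.elts]
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


print(safe_eval("patched / total * 100", {"patched": 3, "total": 4}))  # 75.0
```

With this shape, an input such as `__import__('os').system('...')` raises `ValueError` at the call node and nothing is executed, because function calls are never evaluated. Runtime errors such as division by zero still propagate to the caller.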

Consensus of the group is to move forward with this.