[Meta] Audit Rule Schemas Against Kibana Rule Schemas for Compatibility

terrancedejesus commented 7 months ago

Overview

While reviewing some rule schemas upstream in Kibana, we noticed inconsistencies where inherited objects between rule types are different than how we define them. Thus, validation may pass in our repository but break upstream. We should do an audit of Kibana's rule schemas to ours and add any missing or incorrect mappings.

TRaDE Rule Schema: https://github.com/elastic/detection-rules/blob/main/detection_rules/rule.py Kibana Rule Schema Reference: https://github.com/elastic/kibana/blob/main/x-pack/plugins/security_solution/server/lib/detection_engine/rule_schema/model/rule_schemas.ts

Example: Machine Learning rules inherit base rule data which has rule actions, where rule actions are not allowed in Kibana for ML rules.

SHolzhauer commented 7 months ago

I was going to raise an issue for something similar I noticed yesterday with the new_terms rules.

new_terms

We use the TOML definition standard and the python -m detection_rules test to validate our rules. However we use a custom python script to deploy our own rules (basically call the create/update apis).

When trying to create new_terms rules we noticed we have to "reformat" the json dump when converting from TOML to adhere to the api definition.

The TOML (and the python -m detection_rules test) expects this format:

[rule.new_terms]
field = "new_terms_fields"
value = ["user.name", "event.module"]
[[rule.new_terms.history_window_start]]
field = "history_window_start"
value = "now-30d"

But we need to do the following (snippet) to use it in our api calls:

def mutate_new_terms(r):
    """Mutate the json to correctly be formatted

    Args:
        r (dict): the rule
    """    
    new_rule = {}
    new_rule["metadata"] = r["metadata"]
    new_rule["rule"] = {}
    for k in r["rule"]:
        if k != "new_terms":
            new_rule["rule"][k] = r["rule"][k]
        else:
            new_rule["rule"]["new_terms_fields"] = r["rule"]["new_terms"]["value"]
            new_rule["rule"]["history_window_start"] = r["rule"]["new_terms"]["history_window_start"][0]["value"]

    return new_rule

t_rule = toml.loads(rule)

if t_rule["rule"]["type"] == "new_terms":
    t_rule = mutate_new_terms(t_rule)

with open(f"{rulefile}.json", "w+") as jsonf:
    jsonf.write(json.dumps(t_rule))

toml_rules.append(t_rule)

In essence we need to simplify the TOML to JSON format a bit on the new_terms fields.

For reference our api call code:

for r in toml_rules:
  requests.patch(
        url="https://<endpoint>/s/<space_name>/api/detection_engine/rules",
        data=json.dumps(r["rule"]),
        headers={
            "Content-Type": "application/json",
            "kbn-xsrf": "d466805d-aaaa-1111-1111-fe8ebe631af7"
        },
        params={
            "rule_id": r["rule"]["rule_id"]
        },
        auth=(kbnuser,kbnpwd),
        verify=False
    )

Investigation_fields

For this field something similar happens. The python -m detection_rules test starts complaining when we add the TOML definition for it (which is accepted by the api after json conversion).

[rule.investigation_fields]
field_names = ["event.category", "event.dataset", "event.action", "user.name"]

error from the test command:

tests\base.py:71: in setUp
    self.fail(f'Rule loader failure: \n{RULE_LOADER_FAIL_MSG}')
E   AssertionError: Rule loader failure: 
E   {'rule': [ValidationError({'language': ['Must be equal to eql.'], 'type': ['Must be equal to eql.'], 'investigation_fields': ['Unknown field.'], 'new_terms': ['Unknown field.']}), ValidationError({'language': ['Must be equal to esql.'], 'type': ['Must be equal to esql.'], 'investigation_fields': ['Unknown field.'], 'new_terms': ['Unknown field.'], 'index': ['Unknown field.']}), ValidationError({'type': ['Must be equal to threshold.'], 'threshold': ['Missing data for required field.'], 'investigation_fields': ['Unknown field.'], 'new_terms': ['Unknown field.']}), ValidationError({'threat_index': ['Missing data for required field.'], 'type': ['Must be equal to threat_match.'], 'threat_mapping': ['Missing data for required field.'], 'investigation_fields': ['Unknown field.'], 'new_terms': ['Unknown field.']}), ValidationError({'type': ['Must be equal to machine_learning.'], 'anomaly_threshold': ['Missing data for required field.'], 'machine_learning_job_id': ['Missing data for required field.'], 'investigation_fields': ['Unknown field.'], 'index': ['Unknown field.'], 'new_terms': ['Unknown field.'], 'query': ['Unknown field.'], 'language': ['Unknown field.']}), ValidationError({'type': ['Must be equal to 
query.'], 'investigation_fields': ['Unknown field.'], 'new_terms': ['Unknown field.']}), ValidationError({'investigation_fields': ['Unknown field.']})]}

But the api update "correctly" executes and the rule in kibana is updated:

SHolzhauer commented 7 months ago

Just realized the investigation_fields part is covered in #3224

Mikaayenson commented 3 months ago

Two issues will track the new terms and threshold bugs:

elastic / detection-rules