wazuh / wazuh

Wazuh - The Open Source Security Platform. Unified XDR and SIEM protection for endpoints and cloud workloads.
https://wazuh.com/

AWS Security Lake - Source Version 2 #22880

Open kclinden opened 3 months ago

kclinden commented 3 months ago
Affected integration
AWS

Description

When integrating with AWS Security Lake using Source Version 2, I get the following error:

DEBUG: Processing file aws/SH_FINDINGS/2.0/region=us-west-2/accountId=123456789000/eventDay=20240410/6cc7641e31d4d10209bed021dd7ebe35.gz.parquet in aws-security-data-lake-us-east-1-txlvnteculqdhzkmgk89pghsxqrlmu
DEBUG: +++ Error: Object of type datetime is not JSON serializable
Unknown error: Object of type datetime is not JSON serializable
Traceback (most recent call last):
  File "/var/ossec/wodles/aws/aws-s3.py", line 4250, in <module>
    main(sys.argv[1:])
  File "/var/ossec/wodles/aws/aws-s3.py", line 4237, in main
    asl_queue.sync_events()
  File "/var/ossec/wodles/aws/aws-s3.py", line 3797, in sync_events
    self.bucket_handler.process_file(message["route"])
  File "/var/ossec/wodles/aws/aws-s3.py", line 3667, in process_file
    events_in_file = self.obtain_logs(bucket=message_body['bucket_path'],
  File "/var/ossec/wodles/aws/aws-s3.py", line 3655, in obtain_logs
    events.append(json.dumps(j))
  File "/var/ossec/framework/python/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/var/ossec/framework/python/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/var/ossec/framework/python/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/var/ossec/framework/python/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type datetime is not JSON serializable
[root@wazuh-manager-0 logs]# vim /var/ossec/wodles/aws/aws-s3.py
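
The failure can be reproduced without Security Lake or parquet. The sketch below assumes that `to_pylist()` returns timestamp columns as Python `datetime` objects, which is what the traceback suggests. The field names are hypothetical and not taken from a real Source Version 2 file.

```python
import json
from datetime import datetime

# Hypothetical event shaped like one row from to_pylist(); the datetime value
# stands in for a timestamp-typed parquet column (an assumption).
record = {"time_dt": datetime(2024, 4, 10, 12, 0, 0), "severity_id": 3}

try:
    json.dumps(record)  # the default encoder has no rule for datetime
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the assumption holds, this fails with the same `TypeError` as the last line of the traceback.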


kclinden commented 3 months ago

I tried updating the json.dumps call as shown below, and the error went away: json.dumps(j, indent=4, sort_keys=True, default=str)

    def obtain_logs(self, bucket: str, log_path: str) -> List[str]:
        """Fetch a parquet file from a bucket and obtain a list of the events it contains.

        Parameters
        ----------
        bucket : str
            Bucket to get the file from.
        log_path : str
            Relative path of the file inside the bucket.

        Returns
        -------
        events : List[str]
            Events contained inside the parquet file.
        """
        debug(f'Processing file {log_path} in {bucket}', 2)
        events = []
        try:
            raw_parquet = io.BytesIO(self.client.get_object(Bucket=bucket, Key=log_path)['Body'].read())
        except Exception as e:
            debug(f'Could not get the parquet file {log_path} in {bucket}: {e}', 1)
            sys.exit(21)
        pfile = pq.ParquetFile(raw_parquet)
        for i in pfile.iter_batches():
            for j in i.to_pylist():
                events.append(json.dumps(j, indent=4, sort_keys=True, default=str))
        debug(f'Found {len(events)} events in file {log_path}', 2)
        return events
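
Of the three arguments added to `json.dumps`, only `default=str` affects the error; `indent` and `sort_keys` change the layout of the output. `json.dumps` calls the `default` callable for any object it cannot serialize itself, so `str` turns each `datetime` into its string form. A minimal sketch comparing that with a stricter handler follows. `to_jsonable` and the field names are hypothetical and not part of the Wazuh module.

```python
import json
from datetime import datetime, timezone


def to_jsonable(value):
    """Hypothetical fallback: convert datetimes only, reject anything else."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f'Object of type {type(value).__name__} is not JSON serializable')


# Hypothetical event; the datetime stands in for a parquet timestamp column.
event = {"time_dt": datetime(2024, 4, 10, 12, 0, tzinfo=timezone.utc), "severity_id": 3}

with_str = json.dumps(event, default=str)          # str(datetime): space-separated
with_iso = json.dumps(event, default=to_jsonable)  # isoformat(): ISO 8601 with "T"

print(with_str)
print(with_iso)
```

`default=str` is the smaller change and will stringify any unsupported type, not just `datetime`. The explicit handler is stricter and still raises on unexpected types. Which behaviour the module wants is a design choice this sketch does not settle.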
Selutario commented 3 months ago

Thank you for these reports @kclinden, we will review this.

fdalmaup commented 2 months ago

Issue Update

davidjiglesias commented 2 weeks ago

Although the fix works (tested by modifying the module in 4.8.0, since using the module in the master branch depends on https://github.com/wazuh/wazuh/issues/23672), we should review whether the Security Lake rules need to be modified or extended to account for the new version.

Go ahead with the fix so that we are compatible with Source Version 2 while maintaining compatibility with Source Version 1, ignoring the ruleset for now.
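
A quick way to check the compatibility claim is sketched below. It assumes that Source Version 1 records contain only JSON-native types, since they serialized without a `default` handler before. Both records are hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical records: the v1-style one has only JSON-native values, the
# v2-style one adds a datetime field (both shapes are assumptions).
v1_record = {"time": 1712750400000, "severity": "Medium"}
v2_record = {"time": 1712750400000,
             "time_dt": datetime(2024, 4, 10, 12, 0, tzinfo=timezone.utc)}

# default= is only called for objects json cannot encode itself, so the
# v1-style output should be identical with and without it.
v1_unchanged = json.dumps(v1_record, default=str) == json.dumps(v1_record)

# The v2-style record now serializes, with the datetime emitted as a string.
v2_serialized = json.loads(json.dumps(v2_record, default=str))

print(v1_unchanged, v2_serialized)
```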