matanolabs / matano

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
https://matano.dev
Apache License 2.0

S3 access log source fails transformation due to dependency on "name" config field #110

Open timoguin opened 1 year ago

timoguin commented 1 year ago

Problem

The managed log source type AWS_S3ACCESS fails during transformation unless the name property in the config is set to aws_s3access.

Take this example config:

name: aws_s3_access

managed:
  type: AWS_S3ACCESS

ingest:
  s3_source:
    bucket_name: my-s3-access-logs-bucket

This fails with many "Failed to find avro schema" errors in the transformer.

It only works if the config is changed to name: aws_s3access.

Expected Behavior

The transformer should not depend on the name field to determine the data schema, since name is a human-facing identifier. These messages make it through the data batcher as expected and are tagged correctly with the aws_s3access source type.

I expect them to be transformed appropriately.
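
As a rough illustration of the expected behavior, here is a minimal Python sketch of a schema lookup that keys on managed.type and only falls back to name for non-managed sources. This is not Matano's actual transformer code: MANAGED_SCHEMAS, schema_key, and find_schema are hypothetical names, the schema body is a placeholder, and the assumption that the lowercased managed type matches the schema key is inferred only from the fact that name: aws_s3access works.

```python
from typing import Any, Mapping

# Hypothetical registry of Avro schemas keyed by lowercased managed type.
# The schema body is a placeholder, not the real S3 access log schema.
MANAGED_SCHEMAS = {
    "aws_s3access": {"type": "record", "name": "aws_s3access", "fields": []},
}


def schema_key(config: Mapping[str, Any]) -> str:
    """Pick the schema key from managed.type, falling back to name."""
    managed = config.get("managed") or {}
    managed_type = managed.get("type")
    if managed_type:
        # Assumption: schema keys are the lowercased managed type.
        return managed_type.lower()
    # Non-managed (custom) sources have no managed type, so use the name.
    return config["name"]


def find_schema(config: Mapping[str, Any]) -> dict:
    """Look up the schema for a log source config, or raise LookupError."""
    key = schema_key(config)
    try:
        return MANAGED_SCHEMAS[key]
    except KeyError:
        raise LookupError(f"Failed to find avro schema for {key!r}") from None


# The config from this issue, with a name that differs from the managed type.
config = {
    "name": "aws_s3_access",
    "managed": {"type": "AWS_S3ACCESS"},
    "ingest": {"s3_source": {"bucket_name": "my-s3-access-logs-bucket"}},
}

schema = find_schema(config)
```

With a lookup like this, renaming the log source would not affect which schema is used, because the name is never consulted when managed.type is present.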