matanolabs / matano

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
https://matano.dev
Apache License 2.0

AWS Cost and Usage Reports #118

Open timoguin opened 1 year ago

timoguin commented 1 year ago

I'd like to ingest AWS Cost and Usage Reports (used to be called detailed billing) from S3. They can be delivered in CSV or Parquet/ORC formats.

Considerations

This source would likely be a bit different than others because it might not make sense to normalize the data to ECS. That might make it a good candidate for testing support for ingesting columnar formats.

For a large org these can get pretty big. I think our daily reports are around 20-30GB (in CSV format).

CUR already has good support for exporting into formats that can be easily utilized with Athena, Redshift, and/or QuickSight.

They're not security logs per se, but they are still an important contextual data point, and a place where anomalies can be detected. We have alarms set up to detect when the day's spending spikes (compared to a 30-day rolling average). Usually a spike means configuration changes, but it could also reveal things like crypto-mining attacks.
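To make the alarm idea concrete, here is a minimal sketch of a spike check against a 30-day rolling average. It is not how our alarms or Matano detections are actually implemented: the function name `detect_spend_spikes`, the 1.5x threshold, and the plain list of daily totals are all assumptions for illustration.

```python
from statistics import mean


def detect_spend_spikes(daily_costs, window=30, threshold=1.5):
    """Return indices of days whose cost exceeds `threshold` times the
    average of the preceding `window` days.

    `daily_costs` is a list of daily totals, oldest first. The 1.5x
    threshold is an arbitrary illustrative value, not a recommendation.
    """
    spikes = []
    for i in range(window, len(daily_costs)):
        # Baseline is the rolling average of the previous `window` days,
        # excluding the day being checked.
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            spikes.append(i)
    return spikes


# Hypothetical data: 30 flat days, a normal day, then a sharp jump.
costs = [100.0] * 30 + [105.0, 400.0]
spike_days = detect_spend_spikes(costs)
```

Days with fewer than `window` days of history are skipped, so a new account would not raise alerts until the baseline is populated.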

References