This repository is a collection of community-managed custom sources for Amazon Security Lake.
Customers can configure custom sources to bring their own security data into Security Lake. Enterprise security teams spend a significant amount of time discovering log sources in various formats, depending on the source, and correlating them for security analytics. Custom source configuration helps security teams centralize distributed and disparate log sources in the same format.
Security data in Security Lake is centralized and normalized into the Open Cybersecurity Schema Framework (OCSF) and compressed in the open source, column-oriented Apache Parquet format for storage optimization and query efficiency. Having log sources in a centralized location and in a single format can significantly improve security teams’ timelines when performing security analytics. With Security Lake, customers retain full ownership of the security data stored in their account, with complete freedom of choice for analytics.
Before we discuss creating custom sources in detail, it’s important to understand the OCSF core schema, which will aid you in mapping attributes and building out the transformation functions for the custom sources of your choice.
The OCSF project is a vendor-agnostic and open source standard that customers can use to address the complex and heterogeneous nature of security log collection and analysis. Customers can extend and adopt the OCSF core security schema in a range of use cases in their IT environment, application, or solution while complementing their existing security standards and processes. As of this writing, the most recent version of the schema is v1.1.0, and it contains six categories. These are System Activity, Findings, Identity and Access Management, Network Activity, Discovery, and Application Activity. Each category consists of different classes based on the type of activity, and each class has a unique class UID. For example, File System Activity has a class UID of 1001.
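The relationship between categories and class UIDs can be sketched in a few lines of Python. This is a minimal illustration, assuming the core-schema convention that a class UID is the category UID multiplied by 1000 plus a class-specific number (so 1001 belongs to category 1). The names `OCSF_CATEGORIES` and `category_for_class` are hypothetical and are not part of this repository.

```python
# OCSF v1.1.0 core categories keyed by category UID.
OCSF_CATEGORIES = {
    1: "System Activity",
    2: "Findings",
    3: "Identity and Access Management",
    4: "Network Activity",
    5: "Discovery",
    6: "Application Activity",
}


def category_for_class(class_uid: int) -> str:
    """Return the category name for a core-schema class UID.

    Assumes class_uid = category_uid * 1000 + class-specific number,
    which holds for core classes but not necessarily for extensions.
    """
    category_uid = class_uid // 1000
    if category_uid not in OCSF_CATEGORIES:
        raise ValueError(f"class UID {class_uid} is not in a core category")
    return OCSF_CATEGORIES[category_uid]


# File System Activity (class UID 1001) falls under System Activity.
file_system_category = category_for_class(1001)
```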
AWS Organizations is configured in your AWS environment. AWS Organizations is an AWS account management service that provides account management and consolidated billing capabilities so you can consolidate multiple AWS accounts and manage them centrally.
Security Lake is activated and a delegated administrator is configured.
Install the AWS Serverless Application Model (AWS SAM) command line interface (CLI). You will use the AWS SAM CLI to deploy the infrastructure required to build the custom source ETL. Follow the AWS SAM developer guide for steps to install the AWS SAM CLI.
The transformation function is a simple Lambda function that reads a mapping configuration file named OCSFmapping.json. The mapping configuration is a JSON-formatted file that captures the log attribute mapping. The structure of the file is as follows:
    {
      "custom_source_events": {
        "source_name": "<custom-source-name>",
        "matched_field": "<log-attribute-matcher>",
        "timestamp": {
          "field": "<timestamp-field-in-log>",
          "format": "%Y-%m-%d %H:%M:%S.%f | epoch"
        },
        "ocsf_mapping": {
          "<iterator>": {
          }
          ....
        }
      }
    }
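To make the structure concrete, here is a minimal Python sketch of how a function might read such a file, pick the mapping for a log event, and normalize its timestamp. It is not the repository's implementation. The sample configuration values, the plain field names `EventId` and `UtcTime`, and the helpers `select_mapping` and `to_epoch_millis` are all hypothetical. The sketch also assumes that the `format` value is either a `strptime` pattern or the literal string `epoch` (seconds), and that log timestamps are in UTC.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical configuration that follows the structure shown above.
CONFIG_TEXT = """
{
  "custom_source_events": {
    "source_name": "example-source",
    "matched_field": "EventId",
    "timestamp": {"field": "UtcTime", "format": "%Y-%m-%d %H:%M:%S.%f"},
    "ocsf_mapping": {
      "1": {"class_uid": 1007},
      "11": {"class_uid": 1001}
    }
  }
}
"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def select_mapping(config: dict, log_event: dict) -> dict:
    """Pick the ocsf_mapping entry whose iterator equals the matched field's value."""
    source = config["custom_source_events"]
    iterator = str(log_event[source["matched_field"]])  # JSON keys are strings
    return source["ocsf_mapping"][iterator]


def to_epoch_millis(config: dict, log_event: dict) -> int:
    """Convert the configured timestamp field to milliseconds since the epoch."""
    ts = config["custom_source_events"]["timestamp"]
    raw = log_event[ts["field"]]
    if ts["format"] == "epoch":
        return int(float(raw) * 1000)  # assumes epoch seconds
    parsed = datetime.strptime(raw, ts["format"]).replace(tzinfo=timezone.utc)
    return (parsed - EPOCH) // timedelta(milliseconds=1)


config = json.loads(CONFIG_TEXT)
event = {"EventId": 11, "UtcTime": "2024-01-01 00:00:00.500"}
mapping = select_mapping(config, event)
event_time = to_epoch_millis(config, event)
```

Keying the lookup on the string form of the matched value keeps it consistent with JSON, where object keys are always strings.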
custom_source_events: Top-level object key that the transformation function looks for in the configuration file.
source_name: Name of the custom source. It must be consistent across all stages of the solution deployment.
matched_field: The name of the log field whose value identifies the log type when the custom source ships multiple types of logs.
ocsf_mapping: The top-level object that holds the OCSF mapping information.
iterator: The unique identifier of a log type, in case the software or product ships multiple types of logs. Each iterator stores the mapping configuration for its log type.
Static values in the configuration file
Some attributes in the OCSF framework, especially metadata objects, can be mapped to static values.
For example:
    "metadata": {
      "profiles": "host",
      "version": "v1.1.0",
      "product": {
        "name": "System Monitor (Sysmon)",
        "vendor_name": "Microsoft Sysinternals",
        "version": "v15.0"
      }
    }
Derived values in the configuration file
Attributes derived from log data must be prefixed with $. so that the function can identify that the value of the attribute should be fetched from the log data.
For example:
    "ip": "$.event.src_ip",
    "port": "$.event.src_port"
Derived values in the configuration file with OCSF defined mapping
Certain attributes have specific mappings provided by the OCSF class. The transformation function identifies these attributes by the enum type. The enum type has two fields that the function uses to populate the appropriate value: evaluate and values.
The evaluate field identifies the key in the log data whose value the function should map to the OCSF-defined value.
The values field maps values from the log data to the values predefined by the OCSF class.
For example:
    "activity_id": {
      "enum": {
        "evaluate": "$.EventId",
        "values": {
          "2": 6,
          "11": 1,
          "15": 1,
          "24": 3,
          "23": 4
        },
        "other": 99
      }
    }
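The three kinds of values described above (static, derived, and enum) can be resolved with one small recursive function. The Python sketch below is illustrative only and does not reproduce the repository's transformation code. The helpers `lookup` and `resolve`, the sample mapping, and the sample log event are hypothetical. It also assumes that the `other` field supplies the fallback when the log value is not listed in `values`, which the example configuration suggests but the text does not state.

```python
from typing import Any


def lookup(path: str, log_event: dict) -> Any:
    """Follow a "$.a.b" style path into the log event; return None if a key is missing."""
    current: Any = log_event
    for key in path[2:].split("."):  # drop the leading "$."
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def resolve(value: Any, log_event: dict) -> Any:
    """Resolve one mapping value: derived, enum, nested object, or static."""
    if isinstance(value, str) and value.startswith("$."):
        return lookup(value, log_event)
    if isinstance(value, dict) and "enum" in value:
        enum = value["enum"]
        raw = lookup(enum["evaluate"], log_event)
        # JSON object keys are strings, so compare on the string form.
        return enum["values"].get(str(raw), enum.get("other"))
    if isinstance(value, dict):
        return {key: resolve(item, log_event) for key, item in value.items()}
    return value  # static value, copied as-is


# Hypothetical mapping that combines the three examples above.
mapping = {
    "metadata": {"version": "v1.1.0"},
    "src_endpoint": {"ip": "$.event.src_ip", "port": "$.event.src_port"},
    "activity_id": {
        "enum": {
            "evaluate": "$.EventId",
            "values": {"2": 6, "11": 1, "15": 1, "24": 3, "23": 4},
            "other": 99,
        }
    },
}
log_event = {"EventId": 11, "event": {"src_ip": "10.0.0.5", "src_port": 443}}
record = resolve(mapping, log_event)
```

Returning None for a missing path, rather than raising, is a design choice of this sketch; a real pipeline might prefer to log or reject such events.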
To get started, navigate to the custom source readme documentation starting with CSx- to find the custom source you would like to configure.
Follow the deployment steps in the custom source of your choice. The project supports two patterns for configuring custom sources, depending on the raw log source identified by the parameter LogEventSource:
KinesisDataStream: Kinesis Data Streams is used for log streaming and ETL. Currently the supported custom sources under this pattern are:
S3Bucket: Customer-managed S3 buckets are used to stage raw logs prior to transformation by the transformation Lambda function. Currently the supported custom sources under this pattern are:
NOTE: If you don't find the custom source you are looking for, you can either submit an issue requesting it or contribute the code through a pull request.
After the log ingestion and transformation pipeline is configured and the crawler has created the Glue tables, you will need to configure access to the tables from the query tool in Lake Formation.
Use the link above to configure access to the data. For this solution you will choose the following options for the Lake Formation attributes:
The above configuration will give relevant permissions to the Principals specified.
See CONTRIBUTING for more information.