
[Kubernetes Audit Logs] [AWS Cloudwatch] Create a new datastream for Ingesting EKS Audit Logs #5799

Open | imays11 opened this issue 1 year ago

imays11 commented 1 year ago

Problem

I used the Kubernetes Audit Logs integration to create a ruleset for detecting threat behaviors in Kubernetes using the API server audit logs. As of right now, that integration can only access audit logs from self-managed (local) K8s environments that have access to the control plane. To get audit logs from cloud-managed K8s providers (AWS EKS, GCP GKE, Azure AKS), you have to use one of our cloud-specific integrations to collect those logs from the respective cloud platform's logging solution. To be searchable, those audit log events need further parsing, because they come in as a large JSON blob in a single message field. To align with our current Kubernetes ruleset, those fields also need to be prefixed with kubernetes.audit to match the mapping of our Kubernetes Audit Logs integration, which is what the ruleset was based on.
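
For illustration only, here is a hypothetical, heavily abridged example of the shape difference (not actual EKS output). Collected through a cloud logging integration, the whole audit record arrives as a JSON string in message:

    {
      "message": "{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"verb\":\"create\",\"objectRef\":{\"resource\":\"pods\"}}"
    }

The Kubernetes Audit Logs integration (and therefore the ruleset) expects the same data expanded under kubernetes.audit.*:

    {
      "kubernetes": {
        "audit": {
          "kind": "Event",
          "apiVersion": "audit.k8s.io/v1",
          "verb": "create",
          "objectRef": { "resource": "pods" }
        }
      }
    }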

I worked with @andrewkroh a while back to create this solution. It's specifically designed to ingest K8s audit logs from EKS via the AWS CloudWatch integration and align the field mapping with the current Kubernetes Audit Logs integration. The goal is to eventually do something similar across all the cloud-managed K8s platforms so that we can use one single ruleset for all K8s audit logs.

Request

Below is the documented solution for AWS; while simple in design, it's not very user friendly. My request is for a new datastream that would essentially do this for the user. It could be housed within either the AWS integration or the Kubernetes integration; I'm not sure which is the better long-term home, since the goal is to do this for all the major cloud platforms' K8s offerings. We would either need to create a datastream like this in each of those cloud platform integrations, or create something like this inside the Kubernetes integration itself that can distinguish between cloud providers. With the upcoming launch of Defend for Containers (cc: @learhy), this would be great timing: having a good way to showcase K8s audit log ingestion via our current Kubernetes integration and the pre-built K8s audit log detection rules, at least for an EKS cluster, touches on important aspects of cloud workload protection.

Solution

  1. Within the AWS integration, set up the CloudWatch logs datastream to collect from the log group ARN containing the EKS audit logs.

  2. Place this processor, pulled from the kubernetes.audit_logs integration, directly in the CloudWatch integration policy:

    
    - decode_json_fields:
        fields: [message]
        target: kubernetes.audit

Copied from the kubernetes.audit_logs integration.

https://github.com/elastic/integrations/blob/83b90b0eb7d86b8100f6ed5c976f93cf57c45aff/packages/kubernetes/data_stream/audit_logs/agent/stream/stream.yml.hbs#L15
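
For reference, a slightly fuller sketch of what the processors box in the CloudWatch integration policy could contain. The extra settings are standard decode_json_fields options shown purely as an illustration; they are not copied from the Kubernetes integration and may not be needed:

    - decode_json_fields:
        fields: [message]
        target: kubernetes.audit
        # Optional, illustrative extras (not taken from the integration):
        process_array: true   # also decode top-level JSON arrays
        add_error_key: true   # flag events whose message fails to decode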

Screenshots detailing the above solution (attached images):

- aws_cloudwatch_eks_log_group
- eks_to_k8s_audit_policy_processor
- data_set_name_and_defaults
- properly_ingested_audit_logs

elasticmachine commented 1 year ago

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

imays11 commented 1 year ago

Pinging @elastic/security-external-integrations (Team:Security-External Integrations) for an update on this request

leandrojmp commented 3 months ago

Hello, is there any update on this?

I'm in the same spot: I just configured a CloudWatch integration to get logs from EKS and was looking at how to parse the data. Initially I was thinking of using the reroute processor so the ingest pipeline of the Kubernetes integration could be reused, but I found out that the parsing does not use ingest pipelines and is done at the edge, so I would need to build the parsing myself.

Having an integration that does this would help.

leandrojmp commented 3 months ago

@imays11 and @andrewkroh

If I understood correctly, this approach would take care of parsing the JSON messages for EKS, but the data would still be stored in a CloudWatch data stream, which means it would be necessary to create a custom template for this data.

Would it be possible to use the reroute processor to store the data in a Kubernetes data stream instead and avoid the need to create a custom mapping?

leandrojmp commented 2 months ago

We were able to use the reroute processor to store the data in the correct data stream, but some mappings are still missing.

I opened an issue about it: #10081

imays11 commented 2 months ago

> We were able to use the reroute processor to store the data in the correct data stream, but some mappings are still missing.
>
> I opened an issue about it: #10081

@leandrojmp Do you have your solution documented anywhere?

leandrojmp commented 2 months ago

@imays11 no, but I can explain what I did.

First I configured a CloudWatch integration using the steps you shared, and in the same integration I changed the namespace from default to a custom name like kubernetes-eks, to be used later for filtering.

After that I edited the logs-aws.cloudwatch_logs@custom ingest pipeline and added the following processor.

    {
      "reroute": {
        "if": "ctx?.data_stream?.namespace == \"kubernetes-eks\"",
        "tag": "reroute-kubernetes-eks",
        "ignore_failure": true,
        "dataset": [
          "kubernetes.audit_logs"
        ],
        "namespace": [
          "kubernetes-eks"
        ]
      }
    }

This will reroute the documents to be processed by the kubernetes audit integration pipeline.
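
In case it helps others, here is a minimal Dev Tools sketch of creating that pipeline through the Elasticsearch API; if logs-aws.cloudwatch_logs@custom already exists with other processors, add the reroute to the existing definition instead of overwriting it:

    PUT _ingest/pipeline/logs-aws.cloudwatch_logs@custom
    {
      "description": "Reroute EKS audit logs (namespace kubernetes-eks) to the Kubernetes audit data stream",
      "processors": [
        {
          "reroute": {
            "if": "ctx?.data_stream?.namespace == \"kubernetes-eks\"",
            "tag": "reroute-kubernetes-eks",
            "ignore_failure": true,
            "dataset": [
              "kubernetes.audit_logs"
            ],
            "namespace": [
              "kubernetes-eks"
            ]
          }
        }
      ]
    }

The condition matches the custom namespace set on the CloudWatch integration policy, so only the documents from the EKS log group are rerouted.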

Since the source of the logs is cloudwatch, some fields do not have the correct mapping in the Kubernetes integration template, so I've created the logs-kubernetes.audit_logs@custom ingest pipeline and added the following processors:

{
  "processors": [
    {
      "drop": {
        "if": "ctx.message?.startsWith(\"I06\") || ctx.message?.startsWith(\"E06\") || ctx.message?.startsWith(\"W06\") || ctx.message?.startsWith(\"time\")",
        "ignore_failure": true
      }
    },
    {
      "dot_expander": {
        "field": "aws.cloudwatch",
        "override": true,
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "aws",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "awscloudwatch",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "event.id",
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "event.kind",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "elastic_agent",
        "ignore_missing": true,
        "ignore_failure": true
      }
    },
    {
      "remove": {
        "field": "tags",
        "ignore_missing": true,
        "ignore_failure": true
      }
    }
  ]
}

The first processor drops the non-JSON messages created by EKS. The dot_expander processor is needed so that the aws.cloudwatch field can be removed: that field is not a JSON object but a field with a literal dot in its name, and the remove processor does not work in that case, which was reported here.
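
For clarity, this is roughly what dot_expander changes here (hypothetical, abridged documents; the actual field contents may differ). Before, the document has a single key whose name literally contains dots, which the remove processor cannot target:

    {
      "aws.cloudwatch": {
        "log_group": "/aws/eks/example-cluster/cluster"
      }
    }

After dot_expander, it becomes a regular nested object, so remove can drop the aws field:

    {
      "aws": {
        "cloudwatch": {
          "log_group": "/aws/eks/example-cluster/cluster"
        }
      }
    }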

After that I have a series of remove processors to remove unmapped fields from the final document.

This way I can get the audit logs from an EKS cluster and store them in the logs-kubernetes.audit_logs-* data stream.
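
A quick way to sanity-check a cleanup pipeline like this is the simulate API. A minimal sketch; both sample messages are made up for illustration (a klog-style control plane line that should be dropped, and a minimal audit event that should pass through):

    POST _ingest/pipeline/logs-kubernetes.audit_logs@custom/_simulate
    {
      "docs": [
        { "_source": { "message": "I0601 12:34:56.789012       1 controller.go:123] example non-JSON line" } },
        { "_source": { "message": "{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"verb\":\"get\"}" } }
      ]
    }

The first document should be dropped by the first processor, while the second should pass through unchanged.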

imays11 commented 2 months ago

@leandrojmp Oh wow, thank you for sharing that! It seems like a lot to expect from customers as a workaround. I hope someone can follow up on this conversation and create a user-friendly solution for one or both of our use cases: parsing managed K8s logs while keeping them in the cloud-specific datastream, versus rerouting them to the kubernetes.audit_logs datastream. I think your solution might make more sense for keeping things consistent across all the cloud platforms' managed K8s offerings, but it might take more work to normalize fields across the platforms.

ltflb-bgdi commented 1 month ago

@imays11 @leandrojmp I am also working on ingesting EKS logs via Kinesis Firehose/CloudWatch and would like to see a more out-of-the-box solution. Beyond your specific request for audit log support, we plan to capture all available EKS log types (API server, audit, authenticator, controller manager, scheduler).

First I would like to share some good news. Please note that the pipeline described in the solution above can now be greatly simplified, since the processors you copied from the kubernetes audit logs integration are now part of the ingest pipeline. See: https://github.com/elastic/integrations/pull/10138

Thus only the remaining parts of the EKS pipeline (the reroute and the field cleanup described above) would still be needed.

IMO this should be done in a new aws.eks_logs datastream. Unlike most AWS datastreams, it would only be used to parse and dispatch the logs instead of storing them. The main reason for this approach is that the data (at least for audit logs) is just Kubernetes data, and it does not make sense to duplicate the whole integration. In addition, since there are 5 different datasets, 5 separate EKS datastreams would otherwise be required as well. Last but not least, it seems there are some predefined Kubernetes alerting rules available, which would otherwise have to be ported to the EKS (and Google, Azure) integrations as well.
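
To make that concrete, here is a rough, untested sketch of how a dispatch-only pipeline could route by log type. The aws.cloudwatch.log_stream field and the stream name prefixes are assumptions about how EKS control plane logs arrive via CloudWatch, and the non-audit target dataset name is a placeholder:

    {
      "processors": [
        {
          "dot_expander": {
            "field": "aws.cloudwatch",
            "override": true,
            "ignore_failure": true
          }
        },
        {
          "reroute": {
            "tag": "eks-audit",
            "if": "ctx.aws?.cloudwatch?.log_stream != null && ctx.aws.cloudwatch.log_stream.startsWith('kube-apiserver-audit')",
            "dataset": "kubernetes.audit_logs",
            "ignore_failure": true
          }
        },
        {
          "reroute": {
            "tag": "eks-authenticator",
            "if": "ctx.aws?.cloudwatch?.log_stream != null && ctx.aws.cloudwatch.log_stream.startsWith('authenticator')",
            "dataset": "aws.eks_authenticator",
            "ignore_failure": true
          }
        }
      ]
    }

The remaining log types (API server, controller manager, scheduler) would get analogous reroute processors.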

Please share your feedback on this approach. I will start a POC with an EKS datastream and create a pull request if it works.

It would also be helpful if somebody from Elastic could take a look and join the discussion.