sysdiglabs / falco-aws-firelens-integration

Apache License 2.0

Are there updates for Falco/ AWS FireLens integration configuration files? #3

Open dzilbermanvmw opened 1 year ago

dzilbermanvmw commented 1 year ago

Hello,

I recently opened an issue about integrating Falco with ECS, and I am now facing a similar problem integrating Falco with AWS FireLens on EKS. The K8s config files for FireLens in https://github.com/sysdiglabs/falco-aws-firelens-integration/tree/master/eks/fluent-bit/kubernetes definitely need updating:

  1. service-account.yaml needs its API version updated. It currently has apiVersion: rbac.authorization.k8s.io/v1beta1 for kind: ClusterRole; that version is no longer served as of Kubernetes 1.22, so it should be rbac.authorization.k8s.io/v1.
  2. daemonset.yaml needs its image version updated, since we'd like to run the latest version of that container image:
     containers:
     - name: aws-for-fluent-bit
       image: amazon/aws-for-fluent-bit:latest
  3. The ConfigMap in configmap.yaml needs a [FILTER] section, which Falco needs for formatting event entries (see attached):
[FILTER]
        Name              aws
        Match             *
        imds_version      v1
        az                true
        ec2_instance_id   true
        ec2_instance_type true
        private_ip        true
        ami_id            true
        account_id        true
        hostname          true
        vpc_id            true

Also, possibly due to the configuration in configmap.yaml, the resulting Falco log entries in the CloudWatch falco log group have "extra" data in the "log" field:

{ "account_id": "133776528597", "ami_id": "ami-0d8857ce76f65c24d", "az": "us-west-1c", "ec2_instance_id": "i-0a1624d2e71fb1654", "ec2_instance_type": "m5.2xlarge", "hostname": "ip-10-0-11-242.us-west-1.compute.internal", "log": "2023-02-15T22:34:51.821078371Z stdout F {\"hostname\":\"falco-ctj6p\",\"output\":\"21:36:38.999177750: Warning a shell configuration file has been modified (user=<NA> user_loginuid=-1 command=containerd pid=3428 pcmdline=systemd --switched-root --system --deserialize 21 file=/var/lib/containerd/tmpmounts/containerd-mount2138293862/etc/skel/.bash_profile container_id=host image=<NA>) k8s.ns=<NA> k8s.pod=<NA> container=host\",\"priority\":\"Warning\",\"rule\":\"Modify Shell Configuration File\",\"source\":\"syscall\",\"tags\":[\"file\",\"mitre_persistence\"],\"time\":\"2023-02-15T21:36:38.999177750Z\", \"output_fields\": {\"container.id\":\"host\",\"container.image.repository\":null,\"evt.time\":1676496998999177750,\"fd.name\":\"/var/lib/containerd/tmpmounts/containerd-mount2138293862/etc/skel/.bash_profile\",\"k8s.ns.name\":null,\"k8s.pod.name\":null,\"proc.cmdline\":\"containerd\",\"proc.pcmdline\":\"systemd --switched-root --system --deserialize 21\",\"proc.pid\":3428,\"user.loginuid\":-1,\"user.name\":\"<NA>\"}}", "private_ip": "10.0.11.242", "vpc_id": "vpc-069155af741792b14" }

The timestamp and the "stdout F" fields before the opening "{" of the "log" value prevent the programmatic JSON parsing of that log that I am implementing. I have noticed that in an earlier blog post on the subject, https://aws.amazon.com/blogs/containers/implementing-runtime-security-in-amazon-eks-using-cncf-falco/, the "log" entries are JSON formatted.
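
Until the Fluent Bit configuration is fixed, a consumer can strip the prefix itself. The following is a minimal Python sketch, assuming the "log" value always starts with the three space-separated fields seen in the entry above (timestamp, stream, and a one-letter tag). The function name `extract_falco_event` and the shortened sample entry are made up for illustration.

```python
import json


def extract_falco_event(cloudwatch_entry: str) -> dict:
    """Return the Falco event embedded in the "log" field of one entry.

    Assumes "log" looks like "<timestamp> <stream> <tag> <json>", as in
    the sample entry; this is not an official Falco or Fluent Bit API.
    """
    record = json.loads(cloudwatch_entry)
    # Split off exactly three prefix fields; the remainder is the JSON payload.
    _timestamp, _stream, _tag, payload = record["log"].split(" ", 3)
    return json.loads(payload)


# Shortened, hypothetical entry with the same shape as the one above.
sample = json.dumps({
    "az": "us-west-1c",
    "log": '2023-02-15T22:34:51.821078371Z stdout F '
           '{"priority":"Warning","rule":"Modify Shell Configuration File"}',
})
event = extract_falco_event(sample)
print(event["rule"])  # Modify Shell Configuration File
```

This only works around the symptom on the consumer side; entries whose "log" value lacks the prefix would raise an error here and need separate handling.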

Can you please advise on the validity of the updates suggested above, and on what configuration changes would make the "log" entries JSON formatted and parseable?

Thank you,
Dan Zilberman, AWS Sr. SA

Attachment: firelens_falco_configMap_yaml

Issif commented 1 year ago

Hi @dzilbermanvmw,

Falco is not at fault in this situation. The stdout F pattern you see in all logs is added by cri-o (containerd writes the same CRI log format); it's its log format, and others are struggling with it. See this issue.

I don't know why they chose that format, but it breaks log line parsing when the lines are JSON (e.g. Falco's).

Microsoft (and others) faced that exact same issue and proposed a parser for fluent-bit with a specific regex: https://github.com/microsoft/fluentbit-containerd-cri-o-json-log

pauljflo commented 1 year ago

Hi,

I use this configuration for Fluent Bit, and it formats the entries in CloudWatch correctly:

# Grep Filter drops logs that are only whitespace.
additionalFilters: |
    [FILTER]
        Name  grep
        Match *
        Regex $log (.|\s)*\S(.|\s)*
    [FILTER]
        Name parser
        Match *
        Key_name log
        Parser falco
    [FILTER]
        Name              aws
        Match             *
        imds_version      v1
        az                true
        ec2_instance_id   true
        ec2_instance_type true
        private_ip        true
        ami_id            true
        account_id        true
        hostname          true
        vpc_id            true

additionalInputs: |
    [INPUT]
        Name              tail
        Tag               falco.*
        Path              /var/log/containers/falco*.log
        DB                /var/log/flb_falco.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

additionalOutputs: |
    [OUTPUT]
        Name cloudwatch
        Match falco.**
        region eu-west-2
        log_group_name falco
        log_stream_name alerts
        auto_create_group true

service:
  extraParsers: |
    [PARSER]
        Name      falco
        Format    Regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
        Time_Key  time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   Off
        # Command   |  Decoder | Field | Optional Action
        # =============|==================|=================
        Decode_Field_As   json    log
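
To see what the falco parser above does to a single line, here is a rough Python equivalent. It is only a sketch under a few assumptions: Python spells named groups `(?P<name>...)` where the Fluent Bit regex uses `(?<name>...)`, the sample line is shortened from the log entry earlier in this thread, and `parse_cri_line` is a hypothetical helper, not part of Fluent Bit.

```python
import json
import re

# Same pattern as the [PARSER] Regex, rewritten with Python's (?P<name>...) groups.
CRI_LINE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>P|F) (?P<log>.*)$"
)


def parse_cri_line(line: str) -> dict:
    """Split a CRI-formatted log line and decode its JSON payload."""
    match = CRI_LINE.match(line)
    if match is None:
        raise ValueError("line is not in CRI log format")
    fields = match.groupdict()
    # Mirrors "Decode_Field_As json log": replace the string with parsed JSON.
    fields["log"] = json.loads(fields["log"])
    return fields


# Shortened, hypothetical line in the same format as the entry in this thread.
line = ('2023-02-15T22:34:51.821078371Z stdout F '
        '{"priority":"Warning","source":"syscall"}')
parsed = parse_cri_line(line)
print(parsed["stream"], parsed["logtag"], parsed["log"]["priority"])
# stdout F Warning
```

Lines tagged P are partial, so their payload may not be valid JSON on its own; this sketch would raise an error on them rather than reassemble them.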