falcosecurity / charts

Community managed Helm charts for running Falco with Kubernetes
Apache License 2.0

Falco on GKE/COS - Unable to download Kernel headers #134

Closed nhuray closed 3 years ago

nhuray commented 3 years ago

Describe the bug

Using the official Helm chart and enabling eBPF on a GKE cluster, the kernel headers can't be downloaded because they are not found at https://dl.bintray.com/falcosecurity/driver

How to reproduce it

Install Falco using the Helm chart on a GKE cluster (1.16.13-gke.401) running COS instances, with eBPF enabled as described in the documentation.
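
For reference, the install boils down to something like this (the repo name and namespace are assumptions on my side; the values.yaml is the one shown later in this thread):

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco -n falco -f values.yaml   # values.yaml sets ebpf.enabled: true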

Expected behaviour

falco-driver-loader should download the kernel headers and the Falco service should start properly.

Current behaviour

The service fails to start because the prebuilt driver can't be downloaded from https://dl.bintray.com/falcosecurity/driver:

* Setting up /usr/src links from host
* Running falco-driver-loader with: driver=module, compile=yes, download=yes
* Unloading falco module, if present
* Trying to dkms install falco module with GCC /usr/bin/gcc
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
* Running dkms build failed, couldn't find /var/lib/dkms/falco/2aa88dcf6243982697811df4c1b484bcbe9488a2/build/make.log (with GCC /usr/bin/gcc)
* Trying to dkms install falco module with GCC /usr/bin/gcc-8
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
* Running dkms build failed, couldn't find /var/lib/dkms/falco/2aa88dcf6243982697811df4c1b484bcbe9488a2/build/make.log (with GCC /usr/bin/gcc-8)
* Trying to dkms install falco module with GCC /usr/bin/gcc-6
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
* Running dkms build failed, couldn't find /var/lib/dkms/falco/2aa88dcf6243982697811df4c1b484bcbe9488a2/build/make.log (with GCC /usr/bin/gcc-6)
* Trying to dkms install falco module with GCC /usr/bin/gcc-5
DIRECTIVE: MAKE="'/tmp/falco-dkms-make'"
* Running dkms build failed, couldn't find /var/lib/dkms/falco/2aa88dcf6243982697811df4c1b484bcbe9488a2/build/make.log (with GCC /usr/bin/gcc-5)
* Trying to load a system falco driver, if present
* Trying to find locally a prebuilt falco module for kernel 4.19.112+, if present
* Trying to download prebuilt module from https://dl.bintray.com/falcosecurity/driver/2aa88dcf6243982697811df4c1b484bcbe9488a2/falco_cos_4.19.112%2B_1.ko
curl: (22) The requested URL returned error: 404 Not Found
Download failed, consider compiling your own falco module and loading it or getting in touch with the Falco community
Tue Oct 27 15:50:32 2020: Falco version 0.26.1 (driver version 2aa88dcf6243982697811df4c1b484bcbe9488a2)
Tue Oct 27 15:50:32 2020: Falco initialized with configuration file /etc/falco/falco.yaml
Tue Oct 27 15:50:32 2020: Loading rules from file /etc/falco/falco_rules.yaml:
Tue Oct 27 15:50:33 2020: Loading rules from file /etc/falco/falco_rules.local.yaml:
Tue Oct 27 15:50:34 2020: Unable to load the driver.
Tue Oct 27 15:50:34 2020: Runtime error: error opening device /host/dev/falco0. Make sure you have root credentials and that the falco module is loaded.. Exiting

Environment

Additional context

The https://dl.bintray.com/falcosecurity/driver/2aa88dcf6243982697811df4c1b484bcbe9488a2 directory does not contain any kernel headers for COS.
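
The 404 can be confirmed by requesting the same URL the loader tries (curl -I fetches only the response headers):

curl -I https://dl.bintray.com/falcosecurity/driver/2aa88dcf6243982697811df4c1b484bcbe9488a2/falco_cos_4.19.112%2B_1.ko
# HTTP/1.1 404 Not Found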

leodido commented 3 years ago

Thanks for reporting this @nhuray !

We're in the process of fixing this up (see falcosecurity/falco#1460)

Also, during tomorrow's community call we'll schedule a hotfix 0.26.2 release of Falco containing this fix.

Please join the call if you want/can! 😊

nhuray commented 3 years ago

Thanks @leodido,

Do we have an ETA for falcosecurity/falco#1460? Or could you point me to the documentation for building the kernel module for COS?

fntlnz commented 3 years ago

Hi @nhuray! You can't use the kernel module on COS - the only thing that works there is the eBPF probe.

How did you enable eBPF? From what I see in the logs, you are using the kernel module.

nhuray commented 3 years ago

Hi @fntlnz

I enabled it using the Helm chart value ebpf.enabled: true.

Here is my values.yaml file:

# Default values for Falco.

image:
  registry: docker.io
  repository: falcosecurity/falco
  tag: 0.26.1
  pullPolicy: IfNotPresent
  pullSecrets: []

docker:
  enabled: true
  socket: /var/run/docker.sock

containerd:
  enabled: false
  socket: /run/containerd/containerd.sock

resources:
  # Although the resources needed depend on the actual workload, we provide
  # sane defaults. If you have more questions or concerns, please refer
  # to the #falco Slack channel for more info.
  requests:
    cpu: 100m
    memory: 512Mi
  limits:
    cpu: 200m
    memory: 1024Mi

extraArgs: []
nodeSelector: {}
affinity: {}

rbac:
  # Create and use rbac resources
  create: true

podSecurityPolicy:
  # Create a podSecurityPolicy
  create: false

serviceAccount:
  # Create and use serviceAccount resources
  create: true
  # Use this value as serviceAccountName
  name:

fakeEventGenerator:
  enabled: false
  args:
    - run
    - --loop
    - ^syscall
  replicas: 1

daemonset:
  # Perform rolling updates by default in the DaemonSet agent
  # ref: https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/
  updateStrategy:
    # You can also customize maxUnavailable, maxSurge or minReadySeconds if you
    # need it
    type: RollingUpdate

  ## Extra environment variables that will be passed on to the DaemonSet pods
  env: {}

  ## Add additional pod annotations to pods created by the DaemonSet
  podAnnotations: {}

# If running behind a proxy, you can set the proxy server here
proxy:
  httpProxy:
  httpsProxy:
  noProxy:

# Set daemonset timezone
timezone:

# Set daemonset priorityClassName
priorityClassName:

ebpf:
  # Enable eBPF support for Falco
  enabled: true # required for GKE: https://falco.org/docs/third-party/#gke

  settings:
    # Needed to enable eBPF JIT at runtime for performance reasons.
    # Can be skipped if eBPF JIT is enabled from outside the container
    hostNetwork: true

auditLog:
  # true here activates the K8s Audit Log feature for Falco
  enabled: false

  dynamicBackend:
    # true here configures an AuditSink that will receive the K8s audit logs
    enabled: false
    # define if auditsink client config should point to a fixed url, not the
    # default webserver service
    url: ""

falco:
  # The location of the rules file(s). This can contain one or more paths to
  # separate rules files.
  rulesFile:
    - /etc/falco/falco_rules.yaml
    - /etc/falco/falco_rules.local.yaml
    - /etc/falco/k8s_audit_rules.yaml
    - /etc/falco/rules.d

  # If true, the times displayed in log messages and output messages
  # will be in ISO 8601. By default, times are displayed in the local
  # time zone, as governed by /etc/localtime.
  timeFormatISO8601: false

  # Whether to output events in json or text
  jsonOutput: true

  # When using json output, whether or not to include the "output" property
  # itself (e.g. "File below a known binary directory opened for writing
  # (user=root ....") in the json output.
  jsonIncludeOutputProperty: true

  # Send information logs to stderr and/or syslog Note these are *not* security
  # notification logs! These are just Falco lifecycle (and possibly error) logs.
  logStderr: true
  logSyslog: false

  # Minimum log level to include in logs. Note: these levels are
  # separate from the priority field of rules. This refers only to the
  # log level of Falco's internal logging. Can be one of "emergency",
  # "alert", "critical", "error", "warning", "notice", "info", "debug".
  logLevel: warning

  # Minimum rule priority level to load and run. All rules having a
  # priority more severe than this level will be loaded/run.  Can be one
  # of "emergency", "alert", "critical", "error", "warning", "notice",
  # "info", "debug".
  priority: debug

  # Whether or not output to any of the output channels below is
  # buffered.
  bufferedOutputs: false

  # Falco uses a shared buffer between the kernel and userspace to pass
  # system call information. When Falco detects that this buffer is
  # full and system calls have been dropped, it can take one or more of
  # the following actions:
  #   - "ignore": do nothing. If an empty list is provided, ignore is assumed.
  #   - "log": log a CRITICAL message noting that the buffer was full.
  #   - "alert": emit a Falco alert noting that the buffer was full.
  #   - "exit": exit Falco with a non-zero rc.
  #
  # The rate at which log/alert messages are emitted is governed by a
  # token bucket. The rate corresponds to one message every 30 seconds
  # with a burst of 10 messages.
  syscallEventDrops:
    actions:
      - log
      - alert
    rate: .03333
    maxBurst: 10

  # A throttling mechanism implemented as a token bucket limits the
  # rate of Falco notifications. This throttling is controlled by the following configuration
  # options:
  #  - rate: the number of tokens (i.e. right to send a notification)
  #    gained per second. Defaults to 1.
  #  - max_burst: the maximum number of tokens outstanding. Defaults to 1000.
  #
  # With these defaults, Falco could send up to 1000 notifications after
  # an initial quiet period, and then up to 1 notification per second
  # afterward. It would gain the full burst back after 1000 seconds of
  # no activity.
  outputs:
    rate: 1
    maxBurst: 1000

  # Where security notifications should go.
  # Multiple outputs can be enabled.
  syslogOutput:
    enabled: false

  # If keep_alive is set to true, the file will be opened once and
  # continuously written to, with each output message on its own
  # line. If keep_alive is set to false, the file will be re-opened
  # for each output message.
  #
  # Also, the file will be closed and reopened if Falco is signaled with
  # SIGUSR1.
  fileOutput:
    enabled: false
    keepAlive: false
    filename: ./events.txt

  stdoutOutput:
    enabled: true

  # Falco contains an embedded webserver that can be used to accept K8s
  # Audit Events. These config options control the behavior of that
  # webserver. (By default, the webserver is enabled).
  webserver:
    enabled: true
    listenPort: 8765
    k8sAuditEndpoint: /k8s-audit

  # Possible additional things you might want to do with program output:
  #   - send to a slack webhook:
  #     program: "\"jq '{text: .output}' | curl -d @- -X POST https://hooks.slack.com/services/XXX\""
  #   - logging (alternate method than syslog):
  #     program: logger -t falco-test
  #   - send over a network connection:
  #     program: nc host.example.com 80

  # If keep_alive is set to true, the program will be started once and
  # continuously written to, with each output message on its own
  # line. If keep_alive is set to false, the program will be re-spawned
  # for each output message.
  #
  # Also, the program will be closed and reopened if Falco is signaled with
  # SIGUSR1.
  programOutput:
    enabled: false
    keepAlive: false
    program: mail -s "Falco Notification" someone@example.com
    # program: |
    #   jq 'if .priority == "Emergency" or .priority == "Critical" or .priority == "Error" then
    #     { attachments: [{ text: .output, color: "danger" }]}
    #   elif .priority == "Warning" or .priority == "Notice" then
    #     { attachments: [{ text: .output, color: "warning" }]}
    #   elif .priority == "Informational" then
    #     { attachments: [{ text: .output, color: "good" }]}
    #   else
    #     { attachments: [{ text: .output }]}
    #   end' | curl -d @- -X POST https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx

  httpOutput:
    enabled: false
    url: http://some.url

  # Falco supports running a gRPC server with two main binding types
  # 1. Over the network with mandatory mutual TLS authentication (mTLS)
  # 2. Over a local unix socket with no authentication
  # By default, the gRPC server is disabled, with no enabled services (see grpc_output)
  # please comment/uncomment and change the options below accordingly to configure it.
  # Important note: if Falco has any troubles creating the gRPC server
  # this information will be logged, however the main Falco daemon will not be stopped.
  # gRPC server over network with (mandatory) mutual TLS configuration.
  # This gRPC server is secure by default so you need to generate certificates and update their paths here.
  # By default the gRPC server is off.
  # You can configure the address to bind and expose it.
  # By modifying the threadiness configuration you can fine-tune the number of threads (and context) it will use.
  grpc:
    enabled: false
    threadiness: 0

    # gRPC unix socket with no authentication
    unixSocketPath: "unix:///var/run/falco/falco.sock"

    # gRPC over the network (mTLS) / required when unixSocketPath is empty
    listenPort: 5060
    privateKey: "/etc/falco/certs/server.key"
    certChain: "/etc/falco/certs/server.crt"
    rootCerts: "/etc/falco/certs/ca.crt"

  # gRPC output service.
  # By default it is off.
  # By enabling this all the output events will be kept in memory until you read them with a gRPC client.
  # Make sure to have a consumer for them or leave this disabled.
  grpcOutput:
    enabled: false

customRules: {}
  # Although Falco comes with a nice default rule set for detecting weird
  # behavior in containers, our users are going to customize the run-time
  # security rule sets or policies for the specific container images and
  # applications they run. This feature can be handled in this section.
  #
  # Example:
  #
  # rules-traefik.yaml: |-
  #   [ rule body ]

integrations:
  # If Google Cloud Security Command Center integration is enabled, Falco will
  # be configured to use this integration as program_output and sets the following values:
  # * json_output: true
  # * program_output:
  #     enabled: true
  #     keep_alive: false
  #     program: "curl -d @- -X POST --header 'Content-Type: application/json' --header 'Authorization: authentication_token' url"
  gcscc:
    enabled: false
    webhookUrl: http://sysdig-gcscc-connector.default.svc.cluster.local:8080/events
    webhookAuthenticationToken: b27511f86e911f20b9e0f9c8104b4ec4
  # If Nats Output integration is enabled, Falco will be configured to use this
  # integration as file_output and sets the following values:
  # * json_output: true
  # * json_include_output_property: true
  # * file_output:
  #     enabled: true
  #     keep_alive: true
  #     filename: /tmp/shared-pipe/nats
  natsOutput:
    enabled: false
    natsUrl: "nats://nats.nats-io.svc.cluster.local:4222"
  # If SNS Output integration is enabled, Falco will be configured to use this
  # integration as file_output and sets the following values:
  # * json_output: true
  # * json_include_output_property: true
  # * file_output:
  #     enabled: true
  #     keep_alive: true
  #     filename: /tmp/shared-pipe/nats
  snsOutput:
    enabled: false
    topic: ""
    aws_access_key_id: ""
    aws_secret_access_key: ""
    aws_default_region: ""

  # If GCloud Pub/Sub integration is enabled, Falco will be configured to use this
  # integration as file_output and sets the following values:
  # * json_output: true
  # * json_include_output_property: true
  # * file_output:
  #     enabled: true
  #     keep_alive: true
  #     filename: /tmp/shared-pipe/nats
  pubsubOutput:
    enabled: false
    topic: ""
    credentialsData: ""
    projectID: ""

# Allow Falco to run on Kubernetes 1.6 masters.
tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master

scc:
  # true here enabled creation of Security Context Constraints in Openshift
  create: true

# Add initContainers to Falco pod
extraInitContainers: []

leodido commented 3 years ago

Right, my mistake!

On COS you can only use eBPF probes, thus PR 1460 is not a fix!

My apologieeees.


Anyway, this issue seems more related to the Falco Helm charts (on COS), so it belongs to https://github.com/falcosecurity/charts, the repository to which I'm moving the issue.

In the meantime, could you simply try this?

helm install falco falcosecurity/falco -n falco --set ebpf.enabled=true
nhuray commented 3 years ago

Thanks @leodido

For some reason, running helm directly with the command you gave me works:

* Setting up /usr/src links from host
* Running falco-driver-loader with: driver=bpf, compile=yes, download=yes
* Mounting debugfs
* Found kernel config at /proc/config.gz
* COS detected (build 12371.1079.0), using cos kernel headers
* Downloading https://storage.googleapis.com/cos-tools/12371.1079.0/kernel-headers.tgz
* Extracting kernel sources
* Configuring kernel
* Trying to compile the eBPF probe (falco_cos_4.19.112+_1.o)
* Skipping download, eBPF probe is already present in /root/.falco/falco_cos_4.19.112+_1.o
* eBPF probe located in /root/.falco/falco_cos_4.19.112+_1.o
* Success: eBPF probe symlinked to /root/.falco/falco-bpf.o
Wed Oct 28 14:00:17 2020: Falco version 0.26.1 (driver version 2aa88dcf6243982697811df4c1b484bcbe9488a2)
Wed Oct 28 14:00:17 2020: Falco initialized with configuration file /etc/falco/falco.yaml
Wed Oct 28 14:00:17 2020: Loading rules from file /etc/falco/falco_rules.yaml:
Wed Oct 28 14:00:18 2020: Loading rules from file /etc/falco/falco_rules.local.yaml:

I have to investigate why it fails when I edit the values.yaml...

I'll let you know later today whether it's a mistake on my side or a real bug!

Thanks again

nhuray commented 3 years ago

@leodido I found my mistake.

Actually, I'm deploying Falco using the Helm chart and ArgoCD.

Because previous versions of ArgoCD did not create a default namespace, the common practice was to create a wrapper Helm chart, with a directory structure like this:

.
├── Chart.yaml
├── charts
│   └── falco-1.5.2.tgz
├── requirements.lock
├── requirements.yaml
├── templates
│   └── namespace.yaml
└── values.yaml
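
For context, the requirements.yaml pins the falco chart as a dependency, roughly like this (the repository URL is the usual falcosecurity charts repo; the exact entry may differ):

dependencies:
  - name: falco            # the name the values must be nested under
    version: 1.5.2
    repository: https://falcosecurity.github.io/charts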

With that layout, the wrapper chart's values.yaml needs to nest the values under the dependency chart's name.

So instead of doing:

ebpf:
  enabled: true

I have to do:

falco:  # the Helm chart dependency name
  ebpf:
    enabled: true

So it was a stupid mistake on my side - you can close this ticket. Thanks for your help @leodido and @fntlnz!

leogr commented 3 years ago

@nhuray Thanks for the explanation!

Happy to see you solved that.

/close

poiana commented 3 years ago

@leogr: Closing this issue.

In response to [this](https://github.com/falcosecurity/charts/issues/134#issuecomment-718785347):

> @nhuray Thanks for the explanation!
>
> Happy to see you solved that.
>
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.