falcosecurity / falco-exporter

Prometheus Metrics Exporter for Falco output events
Apache License 2.0

Falco metrics do not populate in Prometheus with falco-exporter #63

Closed revaniki closed 3 years ago

revaniki commented 3 years ago

Describe the bug

I've partnered with Udacity to create a course on microservices security. Falco is front and center per the good work of our community :wink:

I'm working through the demos for the very last lesson in the course, which teaches students how to use Falco for runtime monitoring and incident response. I'm roughly following Leo's awesome blog post: https://falco.org/blog/falco-kind-prometheus-grafana/#install-prometheus

After much debugging, I'm still not seeing the Falco metrics in Prometheus, and consequently not in Grafana. I'm under a huge time crunch: we need to ship the course, and this is the last technical blocker. I would greatly appreciate your help, folks. I'm hard blocked on completion and have already created a lot of content for this.

How to reproduce

To repro:

  1. Create a two node (node1) RKE cluster via Vagrantfile and cluster.yaml

  2. SSH into node1 and node2 and install kernel drivers for falco

rpm --import https://falco.org/repo/falcosecurity-3672BA8F.asc
curl -s -o /etc/zypp/repos.d/falcosecurity.repo https://falco.org/repo/falcosecurity-rpm.repo

Install kernel headers:

zypper -n install kernel-default-devel

  3. Generate certs manually and deploy Falco as a DaemonSet. Note: the certs are stored in the certs/ directory on the machine I'm running helm from.
helm install falco falcosecurity/falco \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true \
  --set-file certs.server.key=certs/server.key \
  --set-file certs.server.crt=certs/server.crt \
  --set-file certs.ca.crt=certs/ca.crt
  4. Deploy falco-exporter via its Helm chart, referencing the certs in certs/
helm install falco-exporter \
    --set-file certs.ca.crt=certs/ca.crt,certs.client.key=certs/client.key,certs.client.crt=certs/client.crt \
    falcosecurity/falco-exporter
  5. Deploy Prometheus via the kube-prometheus-stack Helm chart

helm install prometheus prometheus-community/kube-prometheus-stack

  6. Check the relevant pods: everything is up, and Falco generates logs
kubectl get pods                               
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          105m
falco-86mzz                                              1/1     Running   1          3d4h
falco-exporter-jq869                                     1/1     Running   51         3d1h
prometheus-grafana-66c946f558-7j9hq                      2/2     Running   0          106m
prometheus-kube-prometheus-operator-779574c749-gb2h8     1/1     Running   0          106m
prometheus-kube-state-metrics-685b975bb7-xbbw5           1/1     Running   0          106m
prometheus-prometheus-0                                  2/2     Running   1          48m
prometheus-prometheus-1                                  2/2     Running   1          48m
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   1          105m
prometheus-prometheus-node-exporter-vxwst                1/1     Running   9          106m

Port forward all relevant pods:

kubectl --namespace default port-forward falco-exporter-jq869 9376

kubectl --namespace default port-forward prometheus-grafana-66c946f558-7j9hq 3000

kubectl --namespace default port-forward prometheus-prometheus-kube-prometheus-prometheus-0 9090

  7. I see Falco metrics at the falco-exporter endpoint http://127.0.0.1:9376/metrics
# HELP falco_events 
# TYPE falco_events counter
falco_events{hostname="falco-86mzz",k8s_ns_name="<NA>",k8s_pod_name="<NA>",priority="4",rule="Mount Launched in Privileged Container",source="SYSCALL"} 1
  8. Now configure an additional scrape job for falco-exporter. Create a prometheus-additional.yaml file:
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: default
stringData:
  prometheus-additional.yaml: |
    - job_name: falco_exporter
      metrics_path: /metrics
      static_configs:
        - targets: ["falco-exporter:9376"]
type: Opaque
  9. Use that file to run this command, which generates a new file, additional-scrape-configs.yaml, containing the secret

kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml --dry-run=client -o yaml > additional-scrape-configs.yaml

  10. Apply additional-scrape-configs.yaml via kubectl apply -f additional-scrape-configs.yaml

  11. Create a custom ServiceMonitor file, falco_service_monitor.yaml, and apply it with kubectl apply -f falco_service_monitor.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: falco-exporter
  labels:
    release: prometheus
    app: falco-exporter
spec:
  endpoints:
  - port: '9376'
    path: '/metrics'
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app: falco-exporter
      release: prometheus
  12. From Grafana, import the Falco dashboard: https://grafana.com/grafana/dashboards/11914

grafana dashboard is empty

Screen Shot 2021-04-25 at 7 04 39 PM

The falco-exporter ServiceMonitor shows up in Prometheus, but no metrics do.

prometheus service discovery ok for falco-exporter

Screen Shot 2021-04-25 at 7 05 33 PM

prometheus metric source, no falco_events

Screen Shot 2021-04-25 at 7 07 13 PM
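
The manual check in the repro steps (opening the exporter's /metrics page and looking for falco_events) can also be scripted. Below is a minimal sketch, assuming the port-forward to 9376 is still running. The function name count_falco_series is made up for illustration.

```shell
# Count falco_events series in Prometheus exposition text read from stdin.
# count_falco_series is a hypothetical helper name, not part of any tool.
count_falco_series() {
  # grep -c exits non-zero when it counts 0 matches; "|| true" keeps the
  # function usable under "set -e" while still printing the count.
  grep -c '^falco_events{' || true
}

# Example against the port-forwarded exporter (assumes it is reachable):
# curl -s http://127.0.0.1:9376/metrics | count_falco_series
```

A count above zero only shows that the exporter itself is serving events. It says nothing yet about whether Prometheus scrapes them.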

Expected behaviour

I expect to see falco_events in Prometheus and in Grafana.
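
One way to test this expectation without the UI is to ask the Prometheus HTTP API directly. This is a rough sketch, assuming the port-forward to 9090 from the repro steps and the compact JSON that Prometheus normally returns. prom_has_series is a hypothetical helper name.

```shell
# Read a Prometheus instant-query response from stdin.
# Returns 0 if at least one series matched, 1 if the result is empty,
# 2 if the response is not a successful query response.
# prom_has_series is a hypothetical helper name.
prom_has_series() {
  _body=$(cat)
  case "$_body" in
    *'"status":"success"'*) ;;
    *) return 2 ;;
  esac
  case "$_body" in
    *'"result":[]'*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example (assumes Prometheus is port-forwarded to 127.0.0.1:9090):
# curl -s 'http://127.0.0.1:9090/api/v1/query?query=falco_events' | prom_has_series \
#   && echo "falco_events is in Prometheus" || echo "falco_events is missing"
```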

Screenshots

falco-exporter

Screen Shot 2021-04-25 at 7 04 51 PM

prometheus service discovery

Screen Shot 2021-04-25 at 7 05 33 PM

prometheus metric source

Screen Shot 2021-04-25 at 7 07 13 PM

grafana dashboard

Screen Shot 2021-04-25 at 7 04 39 PM

Environment

RKE

docker.io/falcosecurity/falco:0.28.0

Vagrantbox running openSUSE Leap hosted on macOS Catalina

NAME="openSUSE Leap"
VERSION="15.2"
ID="opensuse-leap"
ID_LIKE="suse opensuse"
VERSION_ID="15.2"
PRETTY_NAME="openSUSE Leap 15.2"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:leap:15.2"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"

Linux localhost 5.3.18-lp152.72-default #1 SMP Wed Apr 14 10:13:15 UTC 2021 (013936d) x86_64 x86_64 x86_64 GNU/Linux

See above

Additional context

leogr commented 3 years ago

Hi

Thank you for reporting this! I'm linking the original discussion on Slack as a reference:

Moreover, it seems to me this could be an issue related to how the falco-exporter is deployed by its helm chart. If it were confirmed, I would move this issue to the charts repository.

Finally, just a question: Instead of manually creating the service monitor, have you tried to use the chart's option intended for that (i.e. serviceMonitor.enabled)?

Thanks
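
For reference, enabling the chart's ServiceMonitor could look like the sketch below. serviceMonitor.enabled is the option mentioned above. serviceMonitor.additionalLabels and the release: prometheus label are assumptions: by default kube-prometheus-stack only selects ServiceMonitors labelled with its own Helm release name, so both should be checked against helm show values falcosecurity/falco-exporter. Nothing in this thread confirms this as the fix.

```shell
# Write a hypothetical values file for the falco-exporter chart.
# serviceMonitor.additionalLabels is an ASSUMPTION about the chart's values;
# verify it with: helm show values falcosecurity/falco-exporter
cat > falco-exporter-values.yaml <<'EOF'
serviceMonitor:
  enabled: true
  additionalLabels:
    release: prometheus   # assumed to match the kube-prometheus-stack release name
EOF

# Then redeploy with the same certificate flags as the original install:
# helm upgrade --install falco-exporter falcosecurity/falco-exporter \
#   -f falco-exporter-values.yaml \
#   --set-file certs.ca.crt=certs/ca.crt,certs.client.key=certs/client.key,certs.client.crt=certs/client.crt
```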

revaniki commented 3 years ago

Thanks @leogr for your quick reply. I redeployed falco-exporter with the --set serviceMonitor.enabled=true switch, with no luck. I also tried applying a custom ServiceMonitor config, again with no luck.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: falco-exporter
  labels:
    release: prometheus
    app: falco-exporter
spec:
  endpoints:
  - port: '9376'
    path: '/metrics'
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app: falco-exporter
      release: prometheus 

I will destroy the cluster and start over.

revaniki commented 3 years ago

Hi @leogr , I rebuilt the cluster following the steps verbatim and don't see falco-exporter under targets and service discovery in prom on http://127.0.0.1:9090. I'm not sure how to further troubleshoot at the moment. Trying to get help from folks with depth on Prom.
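
When a ServiceMonitor exists but no target shows up, one thing worth comparing is the ServiceMonitor's selector.matchLabels against the labels actually set on the falco-exporter Service. Below is a small sketch of that comparison in plain shell. selector_matches is a hypothetical helper, and nothing in this thread confirms that a label mismatch was the cause here.

```shell
# Return 0 when every key=value pair in the selector (comma-separated)
# also appears in the label list (comma-separated), mirroring how
# matchLabels requires all listed labels to be present on the object.
# selector_matches is a hypothetical helper name.
selector_matches() {
  selector=$1
  labels=",$2,"
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;
      *) IFS=$old_ifs; return 1 ;;
    esac
  done
  IFS=$old_ifs
  return 0
}

# The Service's labels can be read from the LABELS column of:
# kubectl get svc falco-exporter --show-labels
# and then compared with the selector from the ServiceMonitor above:
# selector_matches "app=falco-exporter,release=prometheus" "<labels from the Service>"
```

Separately, the ServiceMonitor API documents endpoints[].port as the name of the Service port, not its number, so that field is also worth checking against the Service definition.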

leogr commented 3 years ago

Can you confirm this has been solved with the solution described by this comment https://kubernetes.slack.com/archives/CMWH3EH32/p1619753500281500?thread_ts=1619144955.110600&cid=CMWH3EH32 ?

leodido commented 3 years ago

It seems to me this issue belongs to the falco-exporter repository. Moving it there

leogr commented 3 years ago

Assuming this issue has been solved as per this discussion :point_down: https://kubernetes.slack.com/archives/CMWH3EH32/p1619753500281500?thread_ts=1619144955.110600&cid=CMWH3EH32

/close

poiana commented 3 years ago

@leogr: Closing this issue.

In response to [this](https://github.com/falcosecurity/falco-exporter/issues/63#issuecomment-870394599):

>Assuming this issue has been solved as per this discussion :point_down:
>https://kubernetes.slack.com/archives/CMWH3EH32/p1619753500281500?thread_ts=1619144955.110600&cid=CMWH3EH32
>
>/close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

Neneil94 commented 2 years ago

>Assuming this issue has been solved as per this discussion 👇 https://kubernetes.slack.com/archives/CMWH3EH32/p1619753500281500?thread_ts=1619144955.110600&cid=CMWH3EH32
>
>/close

Is it possible to publish this solution in here? I've no access to this slack channel. Looks like they restricted it.

BastienBNG commented 2 years ago

Yes, me too: I have no access to this Slack channel. Is it possible to publish the solution here?

leogr commented 2 years ago

Hey @Neneil94 and @BastienBNG

Unfortunately, that discussion is very long, and it's difficult to share here. Anyhow, the slack channel is not restricted. To access the discussion, you have to:

I hope I've been of some help :)