fluxcd / flux2-monitoring-example

Prometheus monitoring for the Flux control plane
https://fluxcd.io/flux/monitoring/
Apache License 2.0

Fix installation on talos.dev #25

Closed by gecube 7 months ago

gecube commented 7 months ago

When installing on Talos (talos.dev), Prometheus does not run. The cause is Talos's use of PodSecurityConfiguration: certain labels must be present on the monitoring namespace. I have fixed this, and I am kindly asking you to accept this PR.

kingdonb commented 7 months ago

I am not a talos user today, but I may be one tomorrow... I am interested in checking this out and understanding if we support talos appropriately today, thanks for reporting this @gecube 🥇

gecube commented 7 months ago

@kingdonb any chance that it would be accepted? I don't like hanging PRs and stale branches

kingdonb commented 7 months ago

@gecube Yes we discussed this at Bug Scrub yesterday, but I didn't get around to updating the issue here.

I think we should expand support for talos, and I'd like to begin testing it myself. Immediately!

It is going to take me at least one more day to get my local dev environment up. But if we have one more Talos user here who can chime in, validate that this change makes sense, and commit to reporting issues like this when we spot them, I'd be glad to merge it.

Only problem is I do not have write access here. @fluxcd/maintainers Do we have a policy about write access to example repos? I think maybe they would fall under website/community and I should have access already. Or the example repos ought to have a MAINTAINERS file of their own, and I'll apply to be maintainer for the various docs repos.

I don't think I should be core maintainer, I don't have the golang experience to merge PRs in any old repo, but I can help in any of these example repos (and I'd volunteer for this.)

kingdonb commented 7 months ago

We're testing today in Bug Scrub:

  Warning  FailedCreate  7m28s                 daemonset-controller  Error creating: pods "kube-prometheus-stack-prometheus-node-exporter-lxl8c" is forbidden: violates PodSecurity "baseline:latest": host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volumes "proc", "sys", "root"), hostPort (container "node-exporter" uses hostPort 9100)
  Warning  FailedCreate  2m1s (x8 over 7m26s)  daemonset-controller  (combined from similar events): Error creating: pods "kube-prometheus-stack-prometheus-node-exporter-j25ff" is forbidden: violates PodSecurity "baseline:latest": host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volumes "proc", "sys", "root"), hostPort (container "node-exporter" uses hostPort 9100)

This is the DaemonSet that is not getting any pods scheduled, which prevents the HelmRelease from succeeding. All of the non-DaemonSet pods are fine.

The talos pod security docs mention only the one label:

https://www.talos.dev/v1.6/kubernetes-guides/configuration/pod-security/

I've applied that one label, and in my testing it does allow the HelmRelease to complete successfully.
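
For readers following along, here is a minimal sketch of what applying that one label could look like as a manifest. The thread does not spell out the label, so the key and value below are an assumption based on the Talos pod security guide linked above. The namespace name monitoring comes from the original report.

```yaml
# Sketch only: the label key and value are an assumption based on the
# Talos pod security guide linked above, not quoted from this thread.
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring  # namespace named in the original report
  labels:
    # Sets the enforced Pod Security level for this namespace to
    # "privileged" (unrestricted), overriding the cluster-wide
    # "baseline" default that rejected the node-exporter pods above.
    pod-security.kubernetes.io/enforce: privileged
```

With the level raised for this namespace only, the node-exporter DaemonSet's use of hostNetwork, hostPID, hostPath volumes, and hostPort should no longer be rejected at admission, while other namespaces keep the baseline default.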

gecube commented 7 months ago

@kingdonb @stefanprodan thanks for your comments and testing. Fixed.