VictoriaMetrics / helm-charts

Helm charts for VictoriaMetrics, VictoriaLogs and ecosystem
https://victoriametrics.github.io/helm-charts/
Apache License 2.0

[victoria-metrics-k8s-stack] 0.24.5 node-exporter #1307

Open · Dofamin opened this issue 2 weeks ago

Dofamin commented 2 weeks ago

When installing the chart, victoria-metrics-k8s-stack-prometheus-node-exporter won't start up because of a security restriction on the cloud side. Okay, I get it:

Error creating: admission webhook "validation.gatekeeper.sh" denied the request: [k8spsphostnamespace] Sharing the host namespace is not allowed: victoria-metrics-k8s-stack-prometheus-node-exporter-nxm4h, please follow the article https://vk.cc/cfc8TH to get more information
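The exporter needs hostNetwork: true and hostPID: true (see the values below) to collect node-level metrics, which is exactly what a host-namespace policy forbids. For context, a sketch of what such a constraint typically looks like when built from the gatekeeper-library K8sPSPHostNamespace template; the exempted namespaces here are an assumption, not this cluster's actual policy:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostNamespace
metadata:
  name: k8spsphostnamespace   # name taken from the denial message above
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    # assumption: kube-system is excluded, which would explain why moving
    # the DaemonSet there (namespaceOverride below) gets past the webhook
    excludedNamespaces: ["kube-system"]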

I managed to fix this problem with these values for node-exporter:

prometheus-node-exporter:
  enabled: true
  affinity: {}
  configmaps: []
  containerSecurityContext:
    readOnlyRootFilesystem: true
  daemonsetAnnotations: {}
  dnsConfig: {}
  endpoints: []
  env: {}
  extraArgs:
    - --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
    - --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
  extraHostVolumeMounts: []
  extraInitContainers: []
  extraManifests: []
  fullnameOverride: ""
  global:
    imagePullSecrets: []
    imageRegistry: ""
    rbac:
      create: true
      createAggregateClusterRoles: false
      pspAnnotations: {}
      pspEnabled: false
  hostNetwork: true
  hostPID: true
  hostRootFsMount:
    enabled: true
    mountPropagation: HostToContainer
  image:
    digest: ""
    pullPolicy: IfNotPresent
    registry: quay.io
    repository: prometheus/node-exporter
    tag: v1.7.0
  imagePullSecrets: []
  kubeRBACProxy:
    containerSecurityContext: {}
    enabled: false
    extraArgs: []
    image:
      pullPolicy: IfNotPresent
      registry: quay.io
      repository: brancz/kube-rbac-proxy
      sha: ""
      tag: v0.15.0
    resources: {}
  livenessProbe:
    failureThreshold: 3
    httpGet:
      httpHeaders: []
      scheme: http
    initialDelaySeconds: 0
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  nameOverride: ""
  namespaceOverride: kube-system
  networkPolicy:
    enabled: false
  nodeSelector:
    kubernetes.io/os: linux
  podAnnotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
  podLabels:
    jobLabel: node-exporter
  prometheus:
    monitor:
      additionalLabels: {}
      apiVersion: ""
      attachMetadata:
        node: false
      basicAuth: {}
      bearerTokenFile: null
      enabled: true
      interval: ""
      jobLabel: jobLabel
      labelLimit: 0
      labelNameLengthLimit: 0
      labelValueLengthLimit: 0
      metricRelabelings: []
      namespace: ""
      podTargetLabels: []
      proxyUrl: ""
      relabelings: []
      sampleLimit: 0
      scheme: http
      scrapeTimeout: ""
      selectorOverride: {}
      targetLimit: 0
      tlsConfig: {}
    podMonitor:
      additionalLabels: {}
      apiVersion: ""
      attachMetadata:
        node: false
      authorization: {}
      basicAuth: {}
      bearerTokenSecret: {}
      enableHttp2: ""
      enabled: false
      filterRunning: ""
      followRedirects: ""
      honorLabels: true
      honorTimestamps: true
      interval: ""
      jobLabel: ""
      labelLimit: 0
      labelNameLengthLimit: 0
      labelValueLengthLimit: 0
      metricRelabelings: []
      namespace: ""
      oauth2: {}
      params: {}
      path: /metrics
      podTargetLabels: []
      proxyUrl: ""
      relabelings: []
      sampleLimit: 0
      scheme: http
      scrapeTimeout: ""
      selectorOverride: {}
      targetLimit: 0
      tlsConfig: {}
  rbac:
    create: true
    pspAnnotations: {}
    pspEnabled: false
  readinessProbe:
    failureThreshold: 3
    httpGet:
      httpHeaders: []
      scheme: http
    initialDelaySeconds: 0
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  releaseLabel: true
  resources:
    limits:
      cpu: 200m
      memory: 50Mi
    requests:
      cpu: 100m
      memory: 30Mi
  revisionHistoryLimit: 10
  secrets: []
  securityContext:
    fsGroup: 65534
    runAsGroup: 65534
    runAsNonRoot: true
    runAsUser: 65534
  service:
    annotations:
      prometheus.io/scrape: "true"
    enabled: true
    ipDualStack:
      enabled: false
      ipFamilies:
        - IPv6
        - IPv4
      ipFamilyPolicy: PreferDualStack
    listenOnAllInterfaces: true
    nodePort: null
    port: 9100
    portName: http-metrics
    targetPort: 9100
    type: ClusterIP
  serviceAccount:
    annotations: {}
    automountServiceAccountToken: false
    create: true
    imagePullSecrets: []
    name: null
  sidecarHostVolumeMounts: []
  sidecarVolumeMount: []
  sidecars: []
  tolerations:
    - effect: NoSchedule
      operator: Exists
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
  verticalPodAutoscaler:
    controlledResources: []
    enabled: false
    maxAllowed: {}
    minAllowed: {}

  vmServiceScrape:
    # whether we should create a service scrape resource for node-exporter
    enabled: true

    # spec for VMServiceScrape crd
    # https://docs.victoriametrics.com/operator/api.html#vmservicescrapespec
    spec:
      jobLabel: jobLabel
      endpoints:
        - port: metrics
          metricRelabelConfigs:
            - action: drop
              source_labels: [mountpoint]
              regex: "/var/lib/kubelet/pods.+"
AndrewChubatiuk commented 2 weeks ago

Could you please send a diff with the values that you've changed?

Dofamin commented 2 weeks ago

I believe the only actually useful part is namespaceOverride: kube-system.
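If that's right, the whole fix reduces to a minimal override (a sketch; everything else in the dump above should match the chart's defaults):

prometheus-node-exporter:
  namespaceOverride: kube-system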

Dofamin commented 2 weeks ago

Next week I'll bring up a new cluster to test namespaceOverride: kube-system and show one more strange behavior on deletion for issue #1308, which I forgot to document.

AndrewChubatiuk commented 1 week ago

Looks like it's a custom Gatekeeper restriction if deploying node-exporter in the kube-system namespace works without problems. Have you installed and configured Gatekeeper yourself, or was it provided automatically with the cluster by a cloud provider?
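For what it's worth, if the policy turns out to be provider-managed and the constraint itself can't be edited, Gatekeeper can also exempt namespaces cluster-wide through its Config resource. A sketch, assuming the cluster allows editing it (managed clusters often don't) and assuming the release runs in a monitoring namespace:

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config                # Gatekeeper expects this exact name
  namespace: gatekeeper-system
spec:
  match:
    - excludedNamespaces: ["monitoring"]   # assumption: the chart's release namespace
      processes: ["*"]                     # exempt from webhook, audit and sync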