elastic / cloud-on-k8s

Elastic Cloud on Kubernetes

x509: certificate signed by unknown authority #8114

Open kaykhan opened 4 days ago

kaykhan commented 4 days ago

I am using ECK with Fleet and agents. I have set up the Kibana Agent, set the host, username, and password, and left the certificate entry empty.

However, I get the following certificate error in the agent logs:


    {"log.level":"error","@timestamp":"2024-10-17T08:30:37.777Z","message":"Error fetching data for metricset kibana.cluster_actions: error making http request: Get \"https://kibana-prod-eck-kibana-kb-http.elastic-system.svc:5601/api/status\": x509: certificate signed by unknown authority","component":{"binary":"metricbeat","dataset":"elastic_agent.metricbeat","id":"kibana/metrics-default","type":"kibana/metrics"},"log":{"source":"kibana/metrics-default"},"log.origin":{"file.line":256,"file.name":"module/wrapper.go","function":"github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).fetch"},"service.name":"metricbeat","ecs.version":"1.6.0","ecs.version":"1.6.0"}

As far as I understand, ECK should use self-signed certificates.

Kibana config:

    version: 8.15.2
    spec:
      count: 1
      elasticsearchRef:
        name: elasticsearch-prod-eck-elasticsearch
      http:
        service:
          spec:
            type: NodePort
      podTemplate:
        metadata:
          labels:
            scrape: kb
        spec:
          containers:
          - name: kibana
            resources:
              limits:
                memory: 2Gi
                cpu: 1
          tolerations:
            - key: "karpenter/elastic"
              operator: "Exists"
              effect: "NoSchedule"
          nodeSelector:
            karpenter-node-pool: elastic
            karpenter.k8s.aws/instance-size: large
      config:
        xpack.encryptedSavedObjects:
            encryptionKey: wmtk5CT8qsIn31WSXmd0zPvUrDSvezpJF5gHq4c+cDNbOVJDXHmMBl+537PdUHLx
        xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-prod-eck-elasticsearch-es-http.elastic-system.svc:9200"]
        xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-prod-eck-fleet-server-agent-http.elastic-system.svc:8220"]
        xpack.fleet.packages:
          - name: elastic_agent
            version: latest
          - name: fleet_server
            version: latest
          - name: kibana
            version: latest
          - name: kubernetes
            version: latest
          - name: system
            version: latest
          - name: apm
            version: latest
          - name: elasticsearch
            version: latest
        xpack.fleet.agentPolicies:
          - name: Fleet Server on ECK policy
            id: eck-fleet-server
            namespace: default
            monitoring_enabled:
              - logs
              - metrics
            unenroll_timeout: 900
            package_policies:
            - name: fleet_server-1
              id: fleet_server-1
              package:
                name: fleet_server
          - name: Elastic Agent on ECK policy
            id: eck-agent
            namespace: default
            monitoring_enabled:
              - logs
              - metrics
            unenroll_timeout: 900
            is_default: true
            package_policies: 
              - id: kibana-1
                name: kibana-1
                package:
                  name: kibana
              - name: kubernetes-1
                id: kubernetes-1
                package:
                  name: kubernetes
              - id: system-1
                name: system-1
                package:
                  name: system
              - id: apm-1
                name: apm-1
                package:
                  name: apm
              - id: elasticsearch-1
                name: elasticsearch-1
                package:
                  name: elasticsearch


barkbay commented 4 days ago

IIUC you want to set up stack monitoring. Is there any reason for not using the built-in feature: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-stack-monitoring.html?

kaykhan commented 4 days ago

> IIUC you want to set up stack monitoring. Is there any reason for not using the built-in feature: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-stack-monitoring.html?

Built-in feature is self monitoring IIUC?

Since I started using ECK, I've always seen Metricbeat recommended; we are now moving away from Metricbeat to Elastic Agents.

I'm happy to use whatever is recommended.

Can you confirm what is recommended for stack monitoring of Elasticsearch, Kibana, and Fleet + agents?

barkbay commented 4 days ago

> Built-in feature is self monitoring IIUC?

You can have a dedicated monitoring cluster, from the documentation:

> To enable Stack Monitoring, simply reference the monitoring Elasticsearch cluster in the spec.monitoring section of their specification.
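As a rough sketch of what that documentation excerpt describes, the Kibana resource from this issue could reference a separate monitoring cluster as shown below. The monitoring cluster name `monitoring-cluster` and its namespace are hypothetical, and the Kibana resource name is inferred from the Service hostname in the error log, so adjust both to your setup.

```yaml
# Sketch only: built-in ECK Stack Monitoring pointing at a dedicated cluster.
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana-prod-eck-kibana        # inferred from the Service name in the log
  namespace: elastic-system
spec:
  version: 8.15.2
  count: 1
  elasticsearchRef:
    name: elasticsearch-prod-eck-elasticsearch
  monitoring:
    metrics:
      elasticsearchRefs:
        - name: monitoring-cluster    # hypothetical monitoring cluster
          namespace: elastic-system   # assumed namespace
    logs:
      elasticsearchRefs:
        - name: monitoring-cluster    # hypothetical monitoring cluster
          namespace: elastic-system   # assumed namespace
```

The same `spec.monitoring` section can be added to the Elasticsearch resource so that both components ship their metrics and logs to the monitoring cluster.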

kaykhan commented 4 days ago

> Built-in feature is self monitoring IIUC?

> You can have a dedicated monitoring cluster, from the documentation:

> To enable Stack Monitoring, simply reference the monitoring Elasticsearch cluster in the spec.monitoring section of their specification.

Sure, so are you saying we should not be using the elastic agent Kibana and Elasticsearch integrations to monitor our stack?

barkbay commented 4 days ago

> Sure, so are you saying we should not be using the elastic agent Kibana and Elasticsearch integrations to monitor our stack?

As long as your resources are managed by the same ECK instance, and unless you have a specific reason not to do so (which was the reason for my first question), I would say no. Otherwise, I believe you have to manage the user and the certificates manually.

kaykhan commented 4 days ago

> Sure, so are you saying we should not be using the elastic agent Kibana and Elasticsearch integrations to monitor our stack?

> As long as your resources are managed by the same ECK instance, and unless you have a specific reason not to do so (which was the reason for my first question), I would say no. Otherwise, I believe you have to manage the user and the certificates manually.


barkbay commented 4 days ago

> I noticed that the documentation does not show how to monitor Elastic Agents: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-stack-monitoring.html. Is that possible?

Agent monitoring should be enabled by default: https://www.elastic.co/guide/en/fleet/current/monitor-elastic-agent.html

Edit: just realized that this should also enable monitoring: https://github.com/elastic/cloud-on-k8s/blob/613f3a725a93c99f343406a28c8b4c0eea2600a6/config/recipes/elastic-agent/fleet-kubernetes-integration.yaml#L26-L28

> Is it possible for us to manage the index template and ILM policy so we can determine the routing allocation of logs/metrics and the delete phase? I would prefer to codify these changes rather than make them manually in the UI. We like to store the stack monitoring logs/metrics on a separate nodeSet called "monitoring".

Unfortunately I don't think this is possible, only the monitoring pod template can be configured, not the configuration (cc @thbkrkr to keep me honest).

> I was able to resolve the initial problem by setting ssl.verification_mode: "none", although I'm not entirely sure of the implications of this. Could you help me with that?

To fully trust the Kibana cert I think you need to manually mount the Secret that holds the CA and set the path inside the Agent Pod. I don't think we have a properly documented way to do that though.
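A rough sketch of that manual approach, not a documented procedure. It assumes ECK publishes the Kibana CA in a Secret named `<kibana-name>-kb-http-certs-public` with a `ca.crt` key, that the Kibana resource is named `kibana-prod-eck-kibana` (inferred from the Service hostname in the log), and that the Agent runs as a DaemonSet. The Agent name, volume name, and mount path are hypothetical, and the fields unrelated to the mount are left out.

```yaml
# Sketch only: mount the Kibana CA Secret into the Elastic Agent Pod.
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: elastic-agent                 # hypothetical Agent name
  namespace: elastic-system
spec:
  version: 8.15.2
  mode: fleet
  # kibanaRef, fleetServerRef, policyID, etc. omitted for brevity
  daemonSet:
    podTemplate:
      spec:
        containers:
          - name: agent
            volumeMounts:
              - name: kibana-ca       # hypothetical volume name
                mountPath: /mnt/kibana-ca
                readOnly: true
        volumes:
          - name: kibana-ca
            secret:
              # assumed Secret name, following the <name>-kb-http-certs-public pattern
              secretName: kibana-prod-eck-kibana-kb-http-certs-public
```

With the CA mounted, the Kibana integration's SSL settings could point `ssl.certificate_authorities` at `/mnt/kibana-ca/ca.crt` instead of setting `ssl.verification_mode: "none"`, which turns off certificate verification entirely.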

kaykhan commented 4 days ago

> Unfortunately I don't think this is possible, only the monitoring pod template can be configured, not the configuration (cc @thbkrkr to keep me honest).

Okay, that's unfortunate, and I remember now that this was one of the main reasons we moved away from self-monitoring to Metricbeat (2 years ago). Here is the Metricbeat configuration from my existing cluster; it lets us set the ILM and template settings:

    setup.ilm:
      enabled: true
      policy_name: metricbeat-custom
      policy_file: /etc/indice-lifecycle.json
      overwrite: true
    setup.template.settings:
      index:
        routing.allocation.require.type: "monitoring"

I'm currently working on a project to create a new ECK cluster where we plan to use Elastic Agents. I hope to modify the template settings and lifecycle policy, but I still need to research how to do this. Do you know if it's possible? If not, this would mark our second year of trying, without success, to migrate from Metricbeat/Filebeat to Elastic Agents. This functionality is very important to us.

We will also be using https://github.com/elastic/terraform-provider-elasticstack to manage our Elastic Agent / Fleet policies and integrations.

> To fully trust the Kibana cert I think you need to manually mount the Secret that holds the CA and set the path inside the Agent Pod. I don't think we have a properly documented way to do that though.

Okay, until that is documented, I'm going to see how far ssl.verification_mode: "none" gets me.