Unable to connect to a secure ES cluster

Butters646 commented 5 years ago

Is this a request for help?: Yes

Version of Helm and Kubernetes:

Client: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}

Which chart in which version:

REVISION: 1
RELEASED: Mon Jun 10 11:04:13 2019
CHART: fluentd-elasticsearch-3.0.1
USER-SUPPLIED VALUES:
elasticsearch:
  auth:
    enabled: false
    password: xxx
    user: elastic
  buffer_chunk_limit: 2M
  buffer_queue_limit: 8
  host: elasticsearch-master
  logstash_prefix: logstash
  port: 9200
  scheme: https
  ssl_version: TLSv1_2
env: null
extraConfigMaps: null
extraVolumeMounts:
- mountPath: /certs
  name: es-certs
  readOnly: true
extraVolumes:
- name: es-certs
  secret:
    defaultMode: 420
    secretName: elastic-certificate-pem
fluentdArgs: --no-supervisor -q
rbac:
  create: true
secret:
- name: ELASTICSEARCH_PASSWORD
  secret_key: password
  secret_name: elastic-credentials
serviceAccount:
  create: true
  name: ""

COMPUTED VALUES:
affinity: {}
annotations: {}
awsSigningSidecar:
  enabled: false
  image:
    repository: abutaha/aws-es-proxy
    tag: 0.9
configMaps:
  useDefaults:
    containersInputConf: true
    forwardInputConf: true
    monitoringConf: true
    outputConf: true
    systemConf: true
    systemInputConf: true
elasticsearch:
  auth:
    enabled: false
    password: xxx
    user: elastic
  buffer_chunk_limit: 2M
  buffer_queue_limit: 8
  host: elasticsearch-master
  logstash_prefix: logstash
  port: 9200
  scheme: https
  ssl_version: TLSv1_2
extraVolumeMounts:
- mountPath: /certs
  name: es-certs
  readOnly: true
extraVolumes:
- name: es-certs
  secret:
    defaultMode: 420
    secretName: elastic-certificate-pem
fluentdArgs: --no-supervisor -q
hostLogDir:
  dockerContainers: /var/lib/docker/containers
  libSystemdDir: /usr/lib64
  varLog: /var/log
image:
  pullPolicy: IfNotPresent
  repository: gcr.io/fluentd-elasticsearch/fluentd
  tag: v2.5.2
livenessProbe:
  enabled: true
nodeSelector: {}
podAnnotations: {}
podSecurityPolicy:
  annotations: {}
  enabled: false
priorityClassName: ""
prometheusRule:
  enabled: false
  labels: {}
  prometheusNamespace: monitoring
rbac:
  create: true
resources: {}
secret:
- name: ELASTICSEARCH_PASSWORD
  secret_key: password
  secret_name: elastic-credentials
service: {}
serviceAccount:
  create: true
  name: ""
serviceMonitor:
  enabled: false
  interval: 10s
  labels: {}
  path: /metrics
tolerations: {}
updateStrategy:
  type: RollingUpdate

HOOKS:
MANIFEST:

---
# Source: fluentd-elasticsearch/templates/configmaps.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-elasticsearch
  labels:
    app.kubernetes.io/name: fluentd-elasticsearch
    helm.sh/chart: fluentd-elasticsearch-3.0.1
    app.kubernetes.io/instance: fluentd-elasticsearch
    app.kubernetes.io/managed-by: Tiller
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    # This configuration file for Fluentd / td-agent is used
    # to watch changes to Docker log files. The kubelet creates symlinks that
    # capture the pod name, namespace, container name & Docker container ID
    # to the docker logs for pods in the /var/log/containers directory on the host.
    # If running this fluentd configuration in a Docker container, the /var/log
    # directory should be mounted in the container.
    #
    # These logs are then submitted to Elasticsearch which assumes the
    # installation of the fluent-plugin-elasticsearch & the
    # fluent-plugin-kubernetes_metadata_filter plugins.
    # See https://github.com/uken/fluent-plugin-elasticsearch &
    # https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter for
    # more information about the plugins.
    #
    # Example
    # =======
    # A line in the Docker log file might look like this JSON:
    #
    # {"log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #  "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z"}
    #
    # The time_format specification below makes sure we properly
    # parse the time format produced by Docker. This will be
    # submitted to Elasticsearch and should appear like:
    # $ curl 'http://elasticsearch-logging:9200/_search?pretty'
    # ...
    # {
    #      "_index" : "logstash-2014.09.25",
    #      "_type" : "fluentd",
    #      "_id" : "VBrbor2QTuGpsQyTCdfzqA",
    #      "_score" : 1.0,
    #      "_source":{"log":"2014/09/25 22:45:50 Got request with path wombat\n",
    #                 "stream":"stderr","tag":"docker.container.all",
    #                 "@timestamp":"2014-09-25T22:45:50+00:00"}
    #    },
    # ...
    #
    # The Kubernetes fluentd plugin is used to write the Kubernetes metadata to the log
    # record & add labels to the log record if properly configured. This enables users
    # to filter & search logs on any metadata.
    # For example a Docker container's logs might be in the directory:
    #
    #  /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b
    #
    # and in the file:
    #
    #  997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    #
    # where 997599971ee6... is the Docker ID of the running container.
    # The Kubernetes kubelet makes a symbolic link to this file on the host machine
    # in the /var/log/containers directory which includes the pod name and the Kubernetes
    # container name:
    #
    #    synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #    ->
    #    /var/lib/docker/containers/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b/997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b-json.log
    #
    # The /var/log directory on the host is mapped to the /var/log directory in the container
    # running this instance of Fluentd and we end up collecting the file:
    #
    #   /var/log/containers/synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # This results in the tag:
    #
    #  var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # The Kubernetes fluentd plugin is used to extract the namespace, pod name & container name
    # which are added to the log message as a kubernetes field object & the Docker container ID
    # is also added under the docker field object.
    # The final tag is:
    #
    #   kubernetes.var.log.containers.synthetic-logger-0.25lps-pod_default_synth-lgr-997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b.log
    #
    # And the final log record looks like:
    #
    # {
    #   "log":"2014/09/25 21:15:03 Got request with path wombat\n",
    #   "stream":"stderr",
    #   "time":"2014-09-25T21:15:03.499185026Z",
    #   "kubernetes": {
    #     "namespace": "default",
    #     "pod_name": "synthetic-logger-0.25lps-pod",
    #     "container_name": "synth-lgr"
    #   },
    #   "docker": {
    #     "container_id": "997599971ee6366d4a5920d25b79286ad45ff37a74494f262e3bc98d909d0a7b"
    #   }
    # }
    #
    # This makes it easier for users to search for logs by pod name or by
    # the name of the Kubernetes container regardless of how many times the
    # Kubernetes pod has been restarted (resulting in several Docker container IDs).
    # Json Log Example:
    # {"log":"[info:2016-02-16T16:04:05.930-08:00] Some log text here\n","stream":"stdout","time":"2016-02-17T00:04:05.931087621Z"}
    # CRI Log Example:
    # 2016-02-17T00:04:05.931087621Z stdout F [info:2016-02-16T16:04:05.930-08:00] Some log text here
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/containers.log.pos
      tag raw.kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_format %Y-%m-%dT%H:%M:%S.%NZ
        </pattern>
        <pattern>
          format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          time_format %Y-%m-%dT%H:%M:%S.%N%:z
        </pattern>
      </parse>
    </source>

    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>

    # Concatenate multi-line logs
    <filter **>
      @id filter_concat
      @type concat
      key message
      multiline_end_regexp /\n$/
      separator ""
    </filter>

    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @id filter_kubernetes_metadata
      @type kubernetes_metadata
    </filter>

    # Fixes json fields in Elasticsearch
    <filter kubernetes.**>
      @id filter_parser
      @type parser
      key_name log
      reserve_data true
      remove_key_name_field true
      <parse>
        @type multi_format
        <pattern>
          format json
        </pattern>
        <pattern>
          format none
        </pattern>
      </parse>
    </filter>
  system.input.conf: |-
    # Example:
    # 2015-12-21 23:17:22,066 [salt.state       ][INFO    ] Completed state [net.ipv4.ip_forward] at time 23:17:22.066081
    <source>
      @id minion
      @type tail
      format /^(?<time>[^ ]* [^ ,]*)[^\[]*\[[^\]]*\]\[(?<severity>[^ \]]*) *\] (?<message>.*)$/
      time_format %Y-%m-%d %H:%M:%S
      path /var/log/salt/minion
      pos_file /var/log/salt.pos
      tag salt
    </source>

    # Example:
    # Dec 21 23:17:22 gke-foo-1-1-4b5cbd14-node-4eoj startupscript: Finished running startup script /var/run/google.startup.script
    <source>
      @id startupscript.log
      @type tail
      format syslog
      path /var/log/startupscript.log
      pos_file /var/log/startupscript.log.pos
      tag startupscript
    </source>

    # Examples:
    # time="2016-02-04T06:51:03.053580605Z" level=info msg="GET /containers/json"
    # time="2016-02-04T07:53:57.505612354Z" level=error msg="HTTP Error" err="No such image: -f" statusCode=404
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id docker.log
      @type tail
      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=(?<status_code>\d+))?/
      path /var/log/docker.log
      pos_file /var/log/docker.log.pos
      tag docker
    </source>

    # Example:
    # 2016/02/04 06:52:38 filePurge: successfully removed file /var/etcd/data/member/wal/00000000000006d0-00000000010a23d1.wal
    <source>
      @id etcd.log
      @type tail
      # Not parsing this, because it doesn't have anything particularly useful to
      # parse out of it (like severities).
      format none
      path /var/log/etcd.log
      pos_file /var/log/etcd.log.pos
      tag etcd
    </source>

    # Multi-line parsing is required for all the kube logs because very large log
    # statements, such as those that include entire object bodies, get split into
    # multiple lines by glog.
    # Example:
    # I0204 07:32:30.020537    3368 server.go:1048] POST /stats/container/: (13.972191ms) 200 [[Go-http-client/1.1] 10.244.1.3:40537]
    <source>
      @id kubelet.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kubelet.log
      pos_file /var/log/kubelet.log.pos
      tag kubelet
    </source>

    # Example:
    # I1118 21:26:53.975789       6 proxier.go:1096] Port "nodePort for kube-system/default-http-backend:http" (:31429/tcp) was open before and is still needed
    <source>
      @id kube-proxy.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-proxy.log
      pos_file /var/log/kube-proxy.log.pos
      tag kube-proxy
    </source>

    # Example:
    # I0204 07:00:19.604280       5 handlers.go:131] GET /api/v1/nodes: (1.624207ms) 200 [[kube-controller-manager/v1.1.3 (linux/amd64) kubernetes/6a81b50] 127.0.0.1:38266]
    <source>
      @id kube-apiserver.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-apiserver.log
      pos_file /var/log/kube-apiserver.log.pos
      tag kube-apiserver
    </source>

    # Example:
    # I0204 06:55:31.872680       5 servicecontroller.go:277] LB already exists and doesn't need update for service kube-system/kube-ui
    <source>
      @id kube-controller-manager.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-controller-manager.log
      pos_file /var/log/kube-controller-manager.log.pos
      tag kube-controller-manager
    </source>

    # Example:
    # W0204 06:49:18.239674       7 reflector.go:245] pkg/scheduler/factory/factory.go:193: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [2578313/2577886]) [2579312]
    <source>
      @id kube-scheduler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/kube-scheduler.log
      pos_file /var/log/kube-scheduler.log.pos
      tag kube-scheduler
    </source>

    # Example:
    # I0603 15:31:05.793605       6 cluster_manager.go:230] Reading config from path /etc/gce.conf
    <source>
      @id glbc.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/glbc.log
      pos_file /var/log/glbc.log.pos
      tag glbc
    </source>

    # Example:
    # TODO Add a proper example here.
    <source>
      @id cluster-autoscaler.log
      @type tail
      format multiline
      multiline_flush_interval 5s
      format_firstline /^\w\d{4}/
      format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
      time_format %m%d %H:%M:%S.%N
      path /var/log/cluster-autoscaler.log
      pos_file /var/log/cluster-autoscaler.log.pos
      tag cluster-autoscaler
    </source>

    # Logs from systemd-journal for interesting services.
    # TODO(random-liu): Remove this after cri container runtime rolls out.
    <source>
      @id journald-docker
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "docker.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-docker.pos
      </storage>
      read_from_head true
      tag docker
    </source>

    <source>
      @id journald-container-runtime
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "{{ fluentd_container_runtime_service }}.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-container-runtime.pos
      </storage>
      read_from_head true
      tag container-runtime
    </source>

    <source>
      @id journald-kubelet
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "kubelet.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-kubelet.pos
      </storage>
      read_from_head true
      tag kubelet
    </source>

    <source>
      @id journald-node-problem-detector
      @type systemd
      matches [{ "_SYSTEMD_UNIT": "node-problem-detector.service" }]
      <storage>
        @type local
        persistent true
        path /var/log/journald-node-problem-detector.pos
      </storage>
      read_from_head true
      tag node-problem-detector
    </source>

    <source>
      @id kernel
      @type systemd
      matches [{ "_TRANSPORT": "kernel" }]
      <storage>
        @type local
        persistent true
        path /var/log/kernel.pos
      </storage>
      <entry>
        fields_strip_underscores true
        fields_lowercase true
      </entry>
      read_from_head true
      tag kernel
    </source>
  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      @id forward
      @type forward
    </source>
  monitoring.conf: |-
    # Prometheus Exporter Plugin
    # input plugin that exports metrics
    <source>
      @id prometheus
      @type prometheus
    </source>

    <source>
      @id monitor_agent
      @type monitor_agent
    </source>

    # input plugin that collects metrics from MonitorAgent
    <source>
      @id prometheus_monitor
      @type prometheus_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>

    # input plugin that collects metrics for output plugin
    <source>
      @id prometheus_output_monitor
      @type prometheus_output_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>

    # input plugin that collects metrics for in_tail plugin
    <source>
      @id prometheus_tail_monitor
      @type prometheus_tail_monitor
      <labels>
        host ${hostname}
      </labels>
    </source>
  output.conf: |-
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      type_name _doc
      host "#{ENV['OUTPUT_HOST']}"
      port "#{ENV['OUTPUT_PORT']}"
      scheme "#{ENV['OUTPUT_SCHEME']}"
      ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
      ssl_verify true
      logstash_format true
      logstash_prefix "#{ENV['LOGSTASH_PREFIX']}"
      reconnect_on_error true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
        queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
        overflow_action block
      </buffer>
    </match>
---
# Source: fluentd-elasticsearch/templates/service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-elasticsearch
  labels:
    app.kubernetes.io/name: fluentd-elasticsearch
    helm.sh/chart: fluentd-elasticsearch-3.0.1
    app.kubernetes.io/instance: fluentd-elasticsearch
    app.kubernetes.io/managed-by: Tiller
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
# Source: fluentd-elasticsearch/templates/clusterrole.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-elasticsearch
  labels:
    app.kubernetes.io/name: fluentd-elasticsearch
    helm.sh/chart: fluentd-elasticsearch-3.0.1
    app.kubernetes.io/instance: fluentd-elasticsearch
    app.kubernetes.io/managed-by: Tiller
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
# Source: fluentd-elasticsearch/templates/clusterrolebinding.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-elasticsearch
  labels:
    app.kubernetes.io/name: fluentd-elasticsearch
    helm.sh/chart: fluentd-elasticsearch-3.0.1
    app.kubernetes.io/instance: fluentd-elasticsearch
    app.kubernetes.io/managed-by: Tiller
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-elasticsearch
  namespace: logging
roleRef:
  kind: ClusterRole
  name: fluentd-elasticsearch
  apiGroup: rbac.authorization.k8s.io
---
# Source: fluentd-elasticsearch/templates/daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  labels:
    app.kubernetes.io/name: fluentd-elasticsearch
    helm.sh/chart: fluentd-elasticsearch-3.0.1
    app.kubernetes.io/instance: fluentd-elasticsearch
    app.kubernetes.io/managed-by: Tiller
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    type: RollingUpdate

  selector:
    matchLabels:
      app.kubernetes.io/name: fluentd-elasticsearch
      app.kubernetes.io/instance: fluentd-elasticsearch
  template:
    metadata:
      labels:
        app.kubernetes.io/name: fluentd-elasticsearch
        helm.sh/chart: fluentd-elasticsearch-3.0.1
        app.kubernetes.io/instance: fluentd-elasticsearch
        app.kubernetes.io/managed-by: Tiller
        kubernetes.io/cluster-service: "true"
      annotations:
        # This annotation ensures that fluentd does not get evicted if the node
        # supports critical pod annotation based priority scheme.
        # Note that this does not guarantee admission on the nodes (#40573).
        # NB! this annotation is deprecated as of version 1.13 and will be removed in 1.14.
        # ref: https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/
        scheduler.alpha.kubernetes.io/critical-pod: ''
        checksum/config: ae070472c960fbf8a8fd16a51455814c6ba9d07842f2f23551de68d7ba4233bb
    spec:
      serviceAccountName: fluentd-elasticsearch
      containers:
      - name: fluentd-elasticsearch
        image:  "gcr.io/fluentd-elasticsearch/fluentd:v2.5.2"
        imagePullPolicy: "IfNotPresent"
        env:
        - name: FLUENTD_ARGS
          value: "--no-supervisor -q"
        - name: OUTPUT_HOST
          value: "elasticsearch-master"
        - name: OUTPUT_PORT
          value: "9200"
        - name: LOGSTASH_PREFIX
          value: "logstash"
        - name: OUTPUT_SCHEME
          value: "https"
        - name: OUTPUT_SSL_VERSION
          value: "TLSv1_2"
        - name: OUTPUT_BUFFER_CHUNK_LIMIT
          value: "2M"
        - name: OUTPUT_BUFFER_QUEUE_LIMIT
          value: "8"
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: elastic-credentials
              key: "password"
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        resources:
          {}

        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: libsystemddir
          mountPath: /usr/lib64
          readOnly: true
        - name: config-volume
          mountPath: /etc/fluent/config.d
        - mountPath: /certs
          name: es-certs
          readOnly: true
          #pointing to fluentd Dockerfile
        # Liveness probe is aimed to help in situations where fluentd
        # silently hangs for no apparent reasons until manual restart.
        # The idea of this probe is that if fluentd is not queueing or
        # flushing chunks for 5 minutes, something is not right. If
        # you want to change the fluentd configuration, reducing amount of
        # logs fluentd collects, consider changing the threshold or turning
        # liveness probe off completely.
        livenessProbe:
          initialDelaySeconds: 600
          periodSeconds: 60
          exec:
            command:
            - '/bin/sh'
            - '-c'
            - >
              LIVENESS_THRESHOLD_SECONDS=${LIVENESS_THRESHOLD_SECONDS:-300};
              STUCK_THRESHOLD_SECONDS=${STUCK_THRESHOLD_SECONDS:-900};
              if [ ! -e /var/log/fluentd-buffers ];
              then
                exit 1;
              fi;
              touch -d "${STUCK_THRESHOLD_SECONDS} seconds ago" /tmp/marker-stuck;
              if [ -z "$(find /var/log/fluentd-buffers -type d -newer /tmp/marker-stuck -print -quit)" ];
              then
                rm -rf /var/log/fluentd-buffers;
                exit 1;
              fi;
              touch -d "${LIVENESS_THRESHOLD_SECONDS} seconds ago" /tmp/marker-liveness;
              if [ -z "$(find /var/log/fluentd-buffers -type d -newer /tmp/marker-liveness -print -quit)" ];
              then
                exit 1;
              fi;
        ports:
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      # It is needed to copy systemd library to decompress journals
      - name: libsystemddir
        hostPath:
          path: /usr/lib64
      - name: config-volume
        configMap:
          name: fluentd-elasticsearch
      - name: es-certs
        secret:
          defaultMode: 420
          secretName: elastic-certificate-pem

What happened:

I tried to set up the chart by passing in the CA certificate I use with Kibana, in PEM format. When the pods start up, they crash with this error:

2019-06-10 15:10:36 +0000 [error]: unexpected error error_class=Faraday::SSLError error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (OpenSSL::SSL::SSLError) Unable to verify certificate. This may be an issue with the remote host or with Excon. Excon has certificates bundled, but these can be customized:\n\n Excon.defaults[:ssl_ca_path] = path_to_certs\n ENV['SSL_CERT_DIR'] = path_to_certs\n Excon.defaults[:ssl_ca_file] = path_to_file\n ENV['SSL_CERT_FILE'] = path_to_file\n Excon.defaults[:ssl_verify_callback] = callback\n (see OpenSSL::SSL::SSLContext#verify_callback)\nor:\n Excon.defaults[:ssl_verify_peer] = false (less secure).\n"

The docs are unclear about what format the cert needs to be in and where it should be placed. It seems that just giving it what Kibana uses is not correct.

What you expected to happen:

Pods start successfully and communicate with ES.

How to reproduce it (as minimally and precisely as possible):

I installed a secure version of ES off this chart:

https://github.com/elastic/helm-charts/tree/6.5.2-alpha1/elasticsearch#security

ES seems to be working; the cluster is green. I have also been able to get Kibana to talk to it.

Anything else we need to know:

monotek commented 5 years ago

elasticsearch:
  auth:
    enabled: false

needs to be true.

Butters646 commented 5 years ago

I thought I had that set. Retried it with enabled: true.

I see these env vars configured in the daemon set:

          {
            "name": "OUTPUT_USER",
            "value": "elastic"
          },
          {
            "name": "OUTPUT_PASSWORD",
            "value": "mypassword"
          },

I am still getting the same error in the pods however:

2019-06-12 15:12:23 +0000 [error]: unexpected error error_class=Faraday::SSLError error="SSL_connect returned=1 errno=0 state=error: certificate verify failed (OpenSSL::SSL::SSLError) Unable to verify certificate. This may be an issue with the remote host or with Excon. Excon has certificates bundled, but these can be customized:\n\n Excon.defaults[:ssl_ca_path] = path_to_certs\n ENV['SSL_CERT_DIR'] = path_to_certs\n Excon.defaults[:ssl_ca_file] = path_to_file\n ENV['SSL_CERT_FILE'] = path_to_file\n Excon.defaults[:ssl_verify_callback] = callback\n (see OpenSSL::SSL::SSLContext#verify_callback)\nor:\n Excon.defaults[:ssl_verify_peer] = false (less secure).\n

monotek commented 5 years ago

Do you use a self-signed cert? If so, "ssl_verify true" should be false. I'll add a config option for this...

@bartlettc22 as you've implemented this in https://github.com/kiwigrid/helm-charts/pull/41/files and i don't use an ssl secured es server, can you help here?

Butters646 commented 5 years ago

I am not using self signed certs. I use certs that I generated via the ES cert utility.

https://www.elastic.co/guide/en/elasticsearch/reference/current/certutil.html

First I generated the CA, which I used to create the cert for ES. For Kibana I created a secret with the CA cert in PEM format. There were specific values you needed to use with the Kibana chart to get the cert in the right spot on the pod, but I was able to get it working.

I was looking for something similar to do with the fluentd scraper, but couldn't figure out how to do it.
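
For reference, fluent-plugin-elasticsearch has a `ca_file` option for trusting a private CA, which is probably the closest match to what the Kibana chart does. The stock output.conf rendered by chart 3.0.1 (see the manifest above) does not set it, so one rough, untested sketch is to switch off the default output.conf and supply a replacement. The shape of `extraConfigMaps` (file name mapped to file content) and the file name `elastic-certificate.pem` inside the mounted secret are assumptions, not something confirmed in this thread; check the chart README and the keys of your secret. The `<buffer>` section is left out for brevity and would need to be copied over from the stock config.

```yaml
# values.yaml fragment: a sketch under the assumptions stated above.
configMaps:
  useDefaults:
    outputConf: false          # do not render the chart's stock output.conf
extraConfigMaps:
  # Assumption: each key becomes a file under /etc/fluent/config.d
  output.conf: |-
    <match **>
      @id elasticsearch
      @type elasticsearch
      include_tag_key true
      host "#{ENV['OUTPUT_HOST']}"
      port "#{ENV['OUTPUT_PORT']}"
      scheme "#{ENV['OUTPUT_SCHEME']}"
      ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
      ssl_verify true
      # Hypothetical file name; use the key stored in the elastic-certificate-pem secret.
      ca_file /certs/elastic-certificate.pem
      # OUTPUT_USER and OUTPUT_PASSWORD are only set when elasticsearch.auth.enabled is true.
      user "#{ENV['OUTPUT_USER']}"
      password "#{ENV['OUTPUT_PASSWORD']}"
      logstash_format true
      logstash_prefix "#{ENV['LOGSTASH_PREFIX']}"
    </match>
extraVolumes:
- name: es-certs
  secret:
    secretName: elastic-certificate-pem
extraVolumeMounts:
- name: es-certs
  mountPath: /certs
  readOnly: true
```

Verification stays on in this variant, so the certificate presented by ES also has to be valid for the host name fluentd connects to (elasticsearch-master here).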

monotek commented 5 years ago

This pretty much sounds like you're creating a self signed cert as you use your own CA.

If you want to use a real cert create a CSR and send it to a trusted CA. This will cost you money.

Workaround could be a free letsencrypt certificate.

See: https://www.elastic.co/de/blog/x-pack-security-for-elasticsearch-with-lets-encrypt-certificates

Butters646 commented 5 years ago

Ok. I guess that setting says to ignore SSL cert errors? I went in and manually changed the configmap to set ssl_verify false. The pods now start up fine and go green, but I see this error in the logs:

2019-06-12 23:39:18 +0000 [warn]: [elasticsearch] failed to flush the buffer. retry_time=1 next_retry_seconds=2019-06-12 23:39:19 +0000 chunk="58b28eb0053b15224b7edc2b969a3cd6" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-master\", :port=>9200, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\"}): read timeout reached"

I double checked the username and password on the pod's env are the correct creds for the es cluster.

monotek commented 5 years ago

No, if ssl_verify = false it allows self signed certificates. If you have questions about this, ask in the plugin repo issue tracker: https://github.com/uken/fluent-plugin-elasticsearch#user-password-path-scheme-ssl_verify

You're pushing your logs to "elasticsearch-master". Seems wrong to me. At least it would be in the stable/elasticsearch chart. Should be "elasticsearch-client".

In version 4.0.0 of the chart there is a sslVerify config flag available now. See: https://github.com/kiwigrid/helm-charts/pull/121
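
As a sketch, the flag would sit under `elasticsearch` in values.yaml once the chart is at 4.0.0 or later. The key name `sslVerify` and its placement come from this thread; the remaining keys mirror the values posted above, and the effect on output.conf is what the flag is expected to do rather than something verified here.

```yaml
# values.yaml fragment for chart >= 4.0.0 (sketch)
elasticsearch:
  host: elasticsearch-master
  port: 9200
  scheme: https
  ssl_version: TLSv1_2
  sslVerify: false   # expected to render "ssl_verify false" in output.conf
  auth:
    enabled: true
    user: elastic
    password: xxx
```

With verification off, fluentd accepts any certificate the server presents, so this trades the certificate error for weaker transport security.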

Butters646 commented 5 years ago

I am using the ES helm chart for the official ES docker image - https://github.com/elastic/helm-charts/tree/master/elasticsearch

It creates a stateful set with three pods and a service:

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2019-05-31T22:20:34Z"
  name: elasticsearch-master
  namespace: logging
  resourceVersion: "49137365"
  selfLink: /api/v1/namespaces/logging/services/elasticsearch-master
  uid: 4f2cf835-83f2-11e9-8482-0279ebfe8b36
spec:
  clusterIP: 10.16.30.141
  ports:
  - name: http
    port: 9200
    protocol: TCP
    targetPort: 9200
  - name: transport
    port: 9300
    protocol: TCP
    targetPort: 9300
  selector:
    app: elasticsearch-master
    chart: elasticsearch-7.1.0
    heritage: Tiller
    release: elasticsearch
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

monotek commented 5 years ago

I've edited your post again. Please use code tags in the future ;-)

It seems the current fluentd / fluentd-elasticsearch plugin does not support ES 7.1 yet. Could you test with ES version 6.7 please?

I'll update the image in a bit.

monotek commented 5 years ago

Please test with the newest build of the image from my private repo.

registry.hub.docker.com/monotek/fluentd-elasticsearch:28

PR to kubernetes repo: https://github.com/kubernetes/kubernetes/pull/79014
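
A values.yaml override for trying the image mentioned above might look like this. The `image.repository`, `image.tag` and `image.pullPolicy` keys are the ones shown in the computed values at the top of this issue; the tag is quoted so that YAML keeps it a string.

```yaml
# values.yaml fragment (sketch): point the DaemonSet at the test image
image:
  repository: registry.hub.docker.com/monotek/fluentd-elasticsearch
  tag: "28"
  pullPolicy: IfNotPresent
```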

Butters646 commented 5 years ago

It's working now with your image. Thank you so much for your help!

One thing I did notice is that when I set sslVerify: false under elasticsearch in the values.yaml file, the output.conf file in the configmap still had ssl_verify true in it. I had to edit it manually again. Maybe I wasn't using the latest version of the chart? I just ran the install command again. Did I need to update something? (I am not a helm expert.)

monotek commented 5 years ago

You need to do a "helm repo update" before the install so the newest version of the chart is used.

monotek commented 5 years ago

Can we close this? The updated image will be available via this PR: https://github.com/kiwigrid/helm-charts/pull/125

Butters646 commented 5 years ago

Yes. I think my issues have been resolved. Thanks again.

kiwigrid / helm-charts

Unable to connect to a secure ES cluster #116