fluent / helm-charts

Helm Charts for Fluentd and Fluent Bit

Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError) [Fluentd using helm on kubernetes] #382

Open zoroglu opened 1 year ago

zoroglu commented 1 year ago

I'm installing elasticsearch, kibana, and fluentd in Kubernetes with helm charts. The elasticsearch and kibana pods come up smoothly, but the fluentd pods don't, and I get the following errors:

2023-06-13 13:29:39 +0000 [warn]: #0 [filter_kube_metadata] !! The environment variable 'K8S_NODE_NAME' is not set to the node name which can affect the API server and watch efficiency !!
2023-06-13 13:29:39 +0000 [info]: adding match in @KUBERNETES pattern="**" type="relabel"
2023-06-13 13:29:39 +0000 [info]: adding filter in @DISPATCH pattern="**" type="prometheus"
2023-06-13 13:29:39 +0000 [info]: adding match in @DISPATCH pattern="**" type="relabel"
2023-06-13 13:29:39 +0000 [info]: adding match in @OUTPUT pattern="**" type="elasticsearch"
2023-06-13 13:29:41 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2023-06-13 13:29:41 +0000 [warn]: #0 Remaining retry: 14. Retry to communicate after 2 second(s).
2023-06-13 13:29:45 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2023-06-13 13:29:45 +0000 [warn]: #0 Remaining retry: 13. Retry to communicate after 4 second(s).
2023-06-13 13:29:53 +0000 [warn]: #0 Could not communicate to Elasticsearch, resetting connection and trying again. EOFError (EOFError)
2023-06-13 13:29:53 +0000 [warn]: #0 Remaining retry: 12. Retry to communicate after 8 second(s).

command for elasticsearch: helm install elasticsearch-master elastic/elasticsearch -n efk

command for fluentd: helm install fluentd fluent/fluentd --set elasticsearch.password=22qz66p1gsr8ELA1 -n efk

command for kibana: helm install kibana elastic/kibana -n efk
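
For reference, recent versions of the elastic/elasticsearch chart generate the elastic user's password into a secret (elasticsearch-master-credentials for this release, if the chart follows its usual naming); it can be read back with:

kubectl get secret elasticsearch-master-credentials -n efk \
  -o jsonpath='{.data.password}' | base64 -d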

elasticsearch image:

Image:       docker.elastic.co/elasticsearch/elasticsearch:8.5.1
Ports:       9200/TCP, 9300/TCP
Host Ports:  0/TCP, 0/TCP

kibana image:

Image:      docker.elastic.co/kibana/kibana:8.5.1
Port:       5601/TCP
Host Port:  0/TCP

fluentd image:

Image:      fluent/fluentd-kubernetes-daemonset:v1.15.2-debian-elasticsearch7-1.0
Port:       24231/TCP
Host Port:  0/TCP

fluentd config map elasticsearch output

04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host elasticsearch-master
        port 9200
        path ""
        user elastic
        password 22qz66p1gsr8ELA1
      </match>
    </label>

Fluentd is not connecting to elasticsearch.
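
One way to check the connection outside fluentd (a diagnostic sketch; the es-check pod name is arbitrary) is to curl the service from inside the cluster. If elasticsearch is serving TLS, which is the 8.x default, plain HTTP typically fails with an empty reply or EOF, while HTTPS answers:

# plain HTTP, as the fluentd output above uses
kubectl run es-check --rm -it --restart=Never --image=curlimages/curl -n efk -- \
  curl -v http://elasticsearch-master:9200

# HTTPS, skipping certificate verification just for the test
kubectl run es-check --rm -it --restart=Never --image=curlimages/curl -n efk -- \
  curl -vk -u elastic:22qz66p1gsr8ELA1 https://elasticsearch-master:9200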

tspearconquest commented 1 year ago

What does your elasticsearch service look like?

Run kubectl get svc -n efk, find the service for your elasticsearch master, and then run:

kubectl get svc <service name> -n efk -o yaml

and provide that output.

zoroglu commented 1 year ago

> What does your elasticsearch service look like?

# kubectl get svc -n efk

NAME                            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
elasticsearch-master            ClusterIP   10.12.8.24   <none>        9200/TCP,9300/TCP   153m
elasticsearch-master-headless   ClusterIP   None         <none>        9200/TCP,9300/TCP   153m
fluentd                         ClusterIP   10.12.9.32   <none>        24231/TCP           146m
kibana-kibana                   ClusterIP   10.12.8.45   <none>        5601/TCP            133m

# kubectl get svc elasticsearch-master -n efk -o yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: elasticsearch
    meta.helm.sh/release-namespace: efk
  creationTimestamp: "2023-06-13T12:49:55Z"
  labels:
    app: elasticsearch-master
    app.kubernetes.io/managed-by: Helm
    chart: elasticsearch
    heritage: Helm
    release: elasticsearch
  name: elasticsearch-master
  namespace: efk
  resourceVersion: "42070929"
  uid: 74ab95fe-cc2d-450e-98f8-9d0a734b5d40
spec:
  clusterIP: 10.12.8.24
  clusterIPs:
  - 10.12.8.24
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 9200
    protocol: TCP
    targetPort: 9200
  - name: transport
    port: 9300
    protocol: TCP
    targetPort: 9300
  selector:
    app: elasticsearch-master
    chart: elasticsearch
    release: elasticsearch
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

# kubectl get svc elasticsearch-master-headless -n efk -o yaml

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: elasticsearch
    meta.helm.sh/release-namespace: efk
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  creationTimestamp: "2023-06-13T12:49:55Z"
  labels:
    app: elasticsearch-master
    app.kubernetes.io/managed-by: Helm
    chart: elasticsearch
    heritage: Helm
    release: elasticsearch
  name: elasticsearch-master-headless
  namespace: efk
  resourceVersion: "42045239"
  uid: 9c159b82-cc8e-4915-95a1-7b3e0d206f78
spec:
  clusterIP: None
  clusterIPs:
  - None
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 9200
    protocol: TCP
    targetPort: 9200
  - name: transport
    port: 9300
    protocol: TCP
    targetPort: 9300
  publishNotReadyAddresses: true
  selector:
    app: elasticsearch-master
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

tspearconquest commented 1 year ago

This looks fine. Are there any network policies or service mesh in play?

zoroglu commented 1 year ago

> This looks fine. Are there any network policies or service mesh in play?

No, there aren't, and Kibana can connect to Elasticsearch without issues.

Feederhigh5 commented 1 year ago

I had the same issue and found the cause to be TLS encryption: Elasticsearch 8.x enables TLS on its HTTP layer by default, so fluentd's plain-HTTP connection gets dropped, which shows up as the EOFError. Rather than disabling TLS on elasticsearch, I added the self-signed certificates to the fluentd config:

  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host "elasticsearch-master"
        port 9200
        scheme https
        path ""
        user elastic
        password redacted
        ssl_verify true
        ssl_version TLSv1_2
        ca_file "/fluentd/elastic/ca.crt"
        client_cert "/fluentd/elastic/tls.crt"
        client_key "/fluentd/elastic/tls.key" 
      </match>
    </label>
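
The ca_file, client_cert, and client_key above point at the certificates that the official elasticsearch chart generates into a secret (referenced below as elasticsearch-master-certs); you can confirm the secret exists and carries the expected keys (ca.crt, tls.crt, tls.key) with:

kubectl describe secret elasticsearch-master-certs -n efk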

In order for this to work, I also had to mount the elasticsearch certs into the fluentd pod:

volumes:
  - name: elasticsearch-cert
    secret:
      secretName: elasticsearch-master-certs

volumeMounts:
  - name: elasticsearch-cert
    mountPath: /fluentd/elastic/
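
Assuming those overrides live in a values file (fluentd-values.yaml is a hypothetical name; volumes and volumeMounts are top-level values of the fluent/fluentd chart), they can be applied with:

helm upgrade --install fluentd fluent/fluentd -n efk -f fluentd-values.yaml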

This fixed the communication between fluentd and elasticsearch.

FYI: Afterwards I ran into another problem: the liveness and readiness probes of the fluentd pod kept failing. The issue was that, by default, the prometheus configmap was commented out in the chart values:

configMapConfigs:
  - fluentd-prometheus-conf

Adding it fixed the issue, and fluentd is finally up and running again.
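
For reference, the fluentd-prometheus-conf configmap shipped with the chart contains roughly the following (paraphrased from the chart defaults; exact contents may vary by chart version), which serves the /metrics endpoint on port 24231 that the default liveness and readiness probes target:

<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

<source>
  @type prometheus_output_monitor
</source>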

Hope this helps :)