I have found a solution.
It seems that fluentd refuses the fluentbit connection if it can't connect to OpenSearch beforehand.
I was sending logs to OpenSearch on port 9200 (http). When I asked ChatGPT, it suggested using port 443 (https) instead, which I did.
Pinging OpenSearch from the node and from the pod on port 443 was the only request that worked.
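In practice the check was something along these lines, run from a node and from a throwaway pod (domain redacted as below; Amazon OpenSearch Service only serves HTTPS on 443, which is why 9200 never answered):

curl -sv https://vpc-XXXXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com:443    # responds
curl -sv http://vpc-XXXXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com:9200    # hangs and times out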
So, I just added port 443 and scheme https to values.yaml. After that, logs started popping up in OpenSearch Dashboards (Kibana). The values.yaml ended up like this:
# Default values for fluentbit-operator.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# Set this to containerd or crio if you want to collect CRI format logs
containerRuntime: docker

# If you want to deploy a default Fluent Bit pipeline (including Fluent Bit Input, Filter, and Output) to collect Kubernetes logs, you'll need to set the Kubernetes parameter to true
# see https://github.com/fluent/fluent-operator/tree/master/manifests/logging-stack
Kubernetes: true

operator:
  # The init container is used to get the actual storage path of the Docker log files so that it can be mounted to collect the logs.
  # see https://github.com/fluent/fluent-operator/blob/master/manifests/setup/fluent-operator-deployment.yaml#L26
  initcontainer:
    repository: "docker"
    tag: "20.10"
  container:
    repository: "kubesphere/fluent-operator"
    tag: "latest"
  # Fluent Operator resources. Usually users needn't adjust these.
  resources:
    limits:
      cpu: 100m
      memory: 60Mi
    requests:
      cpu: 100m
      memory: 20Mi
  # Specify custom annotations to be added to each Fluent Operator pod.
  annotations: {}
  ## Reference to one or more secrets to be used when pulling images
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
  imagePullSecrets: []
  # - name: "image-pull-secret"
  # Reference one or more key-value pairs of labels that should be attached to fluent-operator
  labels: {}
  # myExampleLabel: someValue
  logPath:
    # The operator currently assumes a Docker container runtime path for the logs as the default; for other container runtimes you can set the location explicitly below.
    # crio: /var/log
    containerd: /var/log

fluentbit:
  image:
    repository: "kubesphere/fluent-bit"
    tag: "v2.0.9"
  # Fluent Bit resources. If you do want to specify resources, adjust them as necessary.
  # You can adjust them based on the log volume.
  resources:
    limits:
      cpu: 500m
      memory: 200Mi
    requests:
      cpu: 10m
      memory: 25Mi
  # Specify custom annotations to be added to each Fluent Bit pod.
  annotations: {}
  ## Request Fluent Bit to exclude (or not) the logs generated by the Pod.
  # fluentbit.io/exclude: "true"
  ## Prometheus can use this tag to automatically discover the Pod and collect monitoring data
  # prometheus.io/scrape: "true"
  # Specify additional custom labels for Fluent Bit pods
  labels: {}
  ## Reference to one or more secrets to be used when pulling images
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
  ##
  imagePullSecrets: []
  # - name: "image-pull-secret"
  secrets: []
  # List of volumes that can be mounted by containers belonging to the pod.
  additionalVolumes: []
  # Pod volumes to mount into the container's filesystem.
  additionalVolumesMounts: []
  # Remove the empty volumes and volumesMounts above, then set additionalVolumes and additionalVolumesMounts as below if you want to collect node exporter metrics
  # additionalVolumes:
  #   - name: hostProc
  #     hostPath:
  #       path: /proc/
  #   - name: hostSys
  #     hostPath:
  #       path: /sys/
  # additionalVolumesMounts:
  #   - mountPath: /host/sys
  #     mountPropagation: HostToContainer
  #     name: hostSys
  #     readOnly: true
  #   - mountPath: /host/proc
  #     mountPropagation: HostToContainer
  #     name: hostProc
  #     readOnly: true
  # Set a limit on the memory the Tail plugin can use when appending data to the Engine.
  # You can find more details here: https://docs.fluentbit.io/manual/pipeline/inputs/tail#config
  # If the limit is reached, ingestion is paused; it resumes once the data is flushed.
  # If the inbound traffic is less than 2.4 Mbps, setting memBufLimit to 5MB is enough.
  # If the inbound traffic is less than 4.0 Mbps, setting memBufLimit to 10MB is enough.
  # If the inbound traffic is less than 13.64 Mbps, setting memBufLimit to 50MB is enough.
  input:
    tail:
      memBufLimit: 5MB
    nodeExporterMetrics: {}
    # Uncomment the nodeExporterMetrics section below if you want to collect node exporter metrics
    # nodeExporterMetrics:
    #   tag: node_metrics
    #   scrapeInterval: 15s
    #   path:
    #     procfs: /host/proc
    #     sysfs: /host/sys
  # Configure the output plugin parameters in Fluent Bit.
  # You can set enable to true to output logs to the specified location.
  output:
    # You can find more supported output plugins here: https://github.com/fluent/fluent-operator/tree/master/docs/plugins/fluentbit/clusteroutput
    es:
      enable: false
      host: "<Elasticsearch url like elasticsearch-logging-data.kubesphere-logging-system.svc>"
      port: 9200
      logstashPrefix: ks-logstash-log
      # path: ""
      # bufferSize: "4KB"
      # index: "fluent-bit"
      # httpUser:
      # httpPassword:
      # logstashFormat: true
      # replaceDots: false
      # enableTLS: false
      # tls:
      #   verify: On
      #   debug: 1
      #   caFile: "<Absolute path to CA certificate file>"
      #   caPath: "<Absolute path to scan for certificate files>"
      #   crtFile: "<Absolute path to certificate file>"
      #   keyFile: "<Absolute path to private key file>"
      #   keyPassword:
      #   vhost: "<Hostname to be used for TLS SNI extension>"
    kafka:
      enable: false
      brokers: "<kafka broker list like xxx.xxx.xxx.xxx:9092,yyy.yyy.yyy.yyy:9092>"
      topics: ks-log
    opentelemetry: {}
    # You can configure the opentelemetry-related configuration here
    opensearch: {}
    # You can configure the opensearch-related configuration here
    stdout:
      enable: true
    # forward:                                 # {{- if .Values.Kubernetes -}} {{- if .Values.fluentd.enable -}}
    #   host: fluentd.fluent.svc.cluster.local # host: {{ .Values.fluentd.name }}.{{ .Release.Namespace }}.svc in fluentbit-output-forward.yaml
    #   port: 24224                            # {{ .Values.fluentd.forward.port }}
  # Configure the default filters in Fluent Bit.
  # The `filter` will filter and parse the collected log information and output the logs in a uniform format. You can choose whether to turn this on or not.
  filter:
    kubernetes:
      enable: true
      labels: true
      annotations: true
    containerd:
      # This is a customized Lua containerd log format converter; you can refer to:
      # https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-clusterfilter-containerd.yaml
      # https://github.com/fluent/fluent-operator/blob/master/charts/fluent-operator/templates/fluentbit-containerd-config.yaml
      enable: false
    systemd:
      enable: false

fluentd:
  enable: true
  name: fluentd
  port: 24224 # port: {{ .Values.fluentd.port }} in fluentd-fluentd.yaml
  image:
    repository: "kubesphere/fluentd"
    tag: "v1.15.3"
  replicas: 1
  forward:
    port: 24224 # port: {{ .Values.fluentd.forward.port }} in fluentbit-output-forward.yaml
  watchedNamespaces:
    - fluent
    - observability-system
    - default
  resources:
    limits:
      cpu: 500m
      memory: 500Mi
    requests:
      cpu: 100m
      memory: 128Mi
  # Configure the output plugin parameters in Fluentd.
  # Fluentd is disabled by default; if you enable it, make sure to also set up an output to use.
  output:
    es:
      enable: false
      host: elasticsearch-logging-data.kubesphere-logging-system.svc
      port: 9200
      logstashPrefix: ks-logstash-log
      buffer:
        enable: false
        type: file
        path: /buffers/es
    kafka:
      enable: false
      brokers: "my-cluster-kafka-bootstrap.default.svc:9091,my-cluster-kafka-bootstrap.default.svc:9092,my-cluster-kafka-bootstrap.default.svc:9093"
      topicKey: kubernetes_ns
      buffer:
        enable: false
        type: file
        path: /buffers/kafka
    stdout:
      enable: true
    opensearch:
      enable: true
      host: vpc-XXXXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com
      port: 443
      logstashPrefix: logs
      scheme: https
      # buffer:
      #   enable: false
      #   type: file
      #   path: /buffers/opensearch

nameOverride: ""
fullnameOverride: ""
namespaceOverride: ""
Keep in mind that fluentd is running on a Kubernetes cluster (EKS).
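With those values applied, one way to confirm delivery end to end is to query the domain from inside the cluster for indices created under the logs logstashPrefix. A hypothetical check, assuming the domain's access policy allows the request:

kubectl run osq --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -s "https://vpc-XXXXX-us-west-2-XXXXXXXX.us-west-2.es.amazonaws.com/_cat/indices/logs-*?v"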
Another issue I had to face was that, after upgrading the fluent-operator release, the changes weren't applied to the fluentd pod.
This is because the fluentd template doesn't handle parameters like scheme, but the CRD does: https://github.com/fluent/helm-charts/blob/main/charts/fluent-operator/crds/fluentd.fluent.io_clusteroutputs.yaml#L1411
So, I just had to apply this change manually and then kill the fluentd pod.
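Roughly, and assuming the operator's default StatefulSet naming in the fluent namespace (the exact commands aren't shown here), that manual step looks like:

kubectl edit clusteroutput fluentd-output-opensearch   # add "scheme: https" under the opensearch output
kubectl delete pod -n fluent fluentd-0                 # restart fluentd so it re-renders its config

After that, the pod picked up the change and rendered the https scheme: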
kubectl get clusteroutput fluentd-output-opensearch -o yaml

apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  annotations:
    meta.helm.sh/release-name: fluent-operator
    meta.helm.sh/release-namespace: fluent
  creationTimestamp: "2023-02-15T20:35:26Z"
  generation: 2
  labels:
    app.kubernetes.io/managed-by: Helm
    output.fluentd.fluent.io/enabled: "true"
  name: fluentd-output-opensearch
  resourceVersion: "14073767"
  uid: 9705d00f-5c10-4b32-916c-f6a487a3ac70
spec:
  outputs:
    - opensearch:
        host: vpc-XXXXX-us-west-2-XXXXXX.us-west-2.es.amazonaws.com
        logstashFormat: true
        logstashPrefix: logs
        port: 443
        scheme: https
The issue
I have been trying to use the fluent-operator to deploy fluentbit and fluentd in a multi-tenant scenario on an EKS cluster.
The goal is to collect logs with fluentbit and forward them to fluentd, which processes them and sends them to OpenSearch.
The logs are being collected by fluentbit, but the fluentbit pod then logs the following error when trying to communicate with fluentd:
The configurations of fluentd-output-opensearch, the fluentd service, the fluentbit service, the fluentbit ClusterOutput, the fluentd pod, and the fluentbit pod all seem OK:
Also, the fluentd globalInputs seem to be correct for fluentd forward inputs:
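For reference, a correctly rendered forward input on the Fluentd custom resource looks roughly like this (a sketch modeled on the fluent-operator samples, not my actual output):

spec:
  globalInputs:
    - forward:
        bind: 0.0.0.0
        port: 24224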
All the fluentbit, fluentd, and fluent-operator pods are up and running in the same namespace.
Why am I getting this error?
Steps to reproduce
I installed the fluent-operator via Helm:
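The exact command isn't reproduced here; a typical install from the fluent Helm charts repo, matching the release name and namespace seen above, looks like:

helm repo add fluent https://fluent.github.io/helm-charts
helm upgrade --install fluent-operator fluent/fluent-operator \
  --namespace fluent --create-namespace \
  -f values.yaml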
The values.yaml has the following configuration: