ricsanfre / pi-cluster

Pi Kubernetes Cluster. Homelab kubernetes cluster automated with Ansible and FluxCD
https://picluster.ricsanfre.com
MIT License

EFK: Add logs aggregation layer based on fluentd #51

Closed ricsanfre closed 2 years ago

ricsanfre commented 2 years ago

Enhancement Request

Add a log aggregation layer to the logging architecture. From this layer, logs can be aggregated, filtered and routed to different destinations for further processing (elasticsearch, kafka, s3, etc.)

[Image: log-architecture diagram]

source: Common architecture patterns with fluentd and fluentbit

Implementation details

The log aggregation layer can be based on fluentd or fluentbit. Both can be used as log forwarders and log aggregators (see the fluentbit documentation). The difference is only in the number of plugins (input, output, etc.) available.

Fluentbit does not support a kafka input plugin (only output). Fluentd supports kafka integration as both input and output. Fluentd should be the right choice for the log aggregation layer, in case the logging architecture evolves in the future to include a Kafka cluster as a buffering mechanism between log forwarders and log aggregators.
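As an illustration, a fluentd aggregator consuming from Kafka could use the `kafka_group` input from fluent-plugin-kafka. A hedged sketch; the broker addresses, consumer group and topic names below are hypothetical:

```
# Consume logs from a Kafka topic acting as buffer (fluent-plugin-kafka input)
<source>
  @type kafka_group
  brokers kafka-broker-0:9092,kafka-broker-1:9092
  consumer_group fluentd-aggregator
  topics logs
  format json
</source>
```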

[Image: Kafka as log buffering]

source: One Year of Log Management at Vinted

Changes to the current logging architecture:

ricsanfre commented 2 years ago

Fluentd also needs to be configured to export Prometheus metrics. See the [fluentd documentation](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus).
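Following that documentation, a minimal sketch of the fluent-plugin-prometheus configuration, exposing a `/metrics` HTTP endpoint (port 24231 by default) and collecting fluentd internal metrics:

```
# HTTP endpoint scraped by Prometheus
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

# fluentd internal metrics (buffer queue length, retry counts, etc.)
<source>
  @type prometheus_monitor
</source>
<source>
  @type prometheus_output_monitor
</source>
```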

ricsanfre commented 2 years ago

About exposing fluentd forwarder service to collect logs outside the cluster

Make the fluentd forwarder port available from outside the cluster to collect logs coming from external hosts (i.e. the gateway), and remove the current exposure of the ES service. Communications with the exposed fluentd service must be secured: TLS needs to be enabled on the exposed service to encrypt the communications, and an authentication mechanism needs to be activated. Within the cluster, Linkerd already encrypts inter-pod communications, but an authentication mechanism must still be provided between the fluentbit forwarders and the fluentd aggregator.

To receive logs from outside the cluster, TLS needs to be enabled anyway. The TLS certificate can be automatically generated by cert-manager.

<source>
  @type forward
  port 24224
  bind 0.0.0.0
  <transport tls>
    cert_path /fluentd/certs/tls.crt
    private_key_path /fluentd/certs/tls.key
  </transport>
  <security>
    self_hostname fluentd-aggregator
    shared_key s1cret0
  </security>
</source>
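On the fluentbit forwarder side, the matching `forward` output would look roughly like the sketch below. It assumes the aggregator is reachable at the certificate's common name (fluentd.picluster.ricsanfre.com) and that the cluster CA certificate is mounted in the forwarder at a hypothetical path:

```
[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd.picluster.ricsanfre.com
    Port          24224
    Self_Hostname fluentbit-forwarder
    Shared_Key    s1cret0
    tls           On
    tls.verify    On
    tls.ca_file   /fluent-bit/certs/ca.crt
```

The `Shared_Key` must match the one configured in the aggregator's `<security>` section.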
ricsanfre commented 2 years ago

About TLS certificate generation and loading in fluentd POD

1) Generate the fluentd TLS certificate with cert-manager using the custom cluster CA.

  apiVersion: cert-manager.io/v1
  kind: Certificate
  metadata:
    name: fluentd-tls
    namespace: k3s-logging
  spec:
    # Secret names are always required.
    secretName: fluentd-tls
    duration: 2160h # 90d
    renewBefore: 360h # 15d
    commonName: fluentd.picluster.ricsanfre.com
    isCA: false
    privateKey:
      algorithm: ECDSA
      size: 256
    usages:
      - server auth
      - client auth
    # At least one of a DNS Name, URI, or IP address is required.
    dnsNames:
      - fluentd.picluster.ricsanfre.com
    # ClusterIssuer: ca-issuer.
    issuerRef:
      name: ca-issuer
      kind: ClusterIssuer
      group: cert-manager.io

cert-manager will create a TLS Secret:

   apiVersion: v1
   kind: Secret
   metadata:
     name: fluentd-tls
     namespace: k3s-logging
   data:
     tls.crt: base64 encoded cert
     tls.key: base64 encoded key
   type: kubernetes.io/tls

2) That certificate can be mounted in the fluentd pod as a volume at /fluentd/certs

```yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fluentd
  name: fluentd
  namespace: k3s-logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - image: "{{ efk_fluentd_aggregator_image }}"
        imagePullPolicy: Always
        name: fluentd
        env:
          # Elastic operator creates elastic service name with format cluster_name-es-http
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: efk-es-http
            # Default elasticsearch default port
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          # Elasticsearch user
          - name: FLUENT_ELASTICSEARCH_USER
            value: "elastic"
          # Elastic operator stores elastic user password in a secret
          - name: FLUENT_ELASTICSEARCH_PASSWORD
            valueFrom:
              secretKeyRef:
                name: "efk-es-elastic-user"
                key: elastic
          # Setting a index-prefix for fluentd. By default index is logstash
          - name:  FLUENT_ELASTICSEARCH_INDEX_NAME
            value: fluentd
          - name: FLUENT_ELASTICSEARCH_LOG_ES_400_REASON
            value: "true"
        ports:
        - containerPort: 24224
          name: forward
          protocol: TCP
        - containerPort: 24231
          name: prometheus
          protocol: TCP
        volumeMounts:
        - mountPath: /fluentd/etc
          name: config
          readOnly: true
        - mountPath: "/fluentd/certs"
          name: fluentd-tls
          readOnly: true
      volumes:
      - configMap:
          defaultMode: 420
          name: fluentd-config
        name: config
      - name: fluentd-tls
        secret:
          secretName: fluentd-tls
```
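To make the forward port reachable from outside the cluster (replacing the current exposure of the ES service), a LoadBalancer Service could be added. A sketch under that assumption:

```yml
apiVersion: v1
kind: Service
metadata:
  name: fluentd
  namespace: k3s-logging
spec:
  type: LoadBalancer
  selector:
    app: fluentd
  ports:
  - name: forward
    port: 24224
    targetPort: forward
    protocol: TCP
```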
ricsanfre commented 2 years ago

About production ready forwarder/aggregator configuration

  1. Fluentbit and fluentd filesystem buffering mechanisms should be enabled.

  2. Fluentd aggregator should be deployed in HA: a Kubernetes Deployment with several replicas. A Kubernetes HPA (Horizontal Pod Autoscaler) could be configured to automatically scale the number of replicas.

  3. Fluentd could be deployed as a StatefulSet instead of a Deployment, with a dedicated PVC for the disk buffer. This way, if a pod is terminated, the buffered data is not lost.
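For point 1, file-based buffering in fluentd is configured per output via a `<buffer>` section. A hedged sketch (the buffer path would have to map to the PVC mount from point 3; the limits shown are illustrative, not tuned values):

```
<match **>
  @type elasticsearch
  host efk-es-http
  port 9200
  <buffer>
    @type file
    path /fluentd/buffer/elasticsearch
    flush_interval 10s
    retry_forever true
    retry_max_interval 30
    chunk_limit_size 8MB
    total_limit_size 512MB
  </buffer>
</match>
```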

ricsanfre commented 2 years ago

About the use of official helm fluentd chart

The official fluentd helm chart also supports deploying fluentd as a Deployment or StatefulSet instead of a DaemonSet. In the Deployment case, HPA is also supported.

values.yml could be something like this:

# Deploy fluentd as deployment
kind: "Deployment"
# Number of replicas
replicaCount: 1
# Enabling HPA
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80

# Do not create serviceAccount, RBAC and podSecurityPolicy objects
serviceAccount:
  create: false
rbac:
  create: false
podSecurityPolicy:
  enabled: false

## Additional environment variables to set for fluentd pods
env:
  ...

# Volumes and VolumeMounts (only configuration files and certificates)
volumes:
- name: etcfluentd-main
  configMap:
    name: fluentd-main
    defaultMode: 0777
- name: etcfluentd-config
  configMap:
    name: fluentd-config
    defaultMode: 0777
- name: fluentd-tls
  secret:
    secretName: fluentd-tls

volumeMounts:
- name: etcfluentd-main
  mountPath: /etc/fluent
- name: etcfluentd-config
  mountPath: /etc/fluent/config.d/
- mountPath: /fluentd/certs
  name: fluentd-tls
  readOnly: true

service:
  type: "ClusterIP"
  annotations: {}
  # loadBalancerIP:
  # externalTrafficPolicy: Local
  ports:
  - name: "forwarder"
    protocol: TCP
    containerPort: 24224
  - name: prometheus
    containerPort: 24231
    protocol: TCP

## Fluentd list of plugins to install
##
plugins: []
# - fluent-plugin-out-http

## Add fluentd config files from K8s configMaps
##
configMapConfigs:
  - fluentd-prometheus-conf
# - fluentd-systemd-conf

## Fluentd configurations:
##
fileConfigs:
  01_sources.conf: |-
    ## logs from podman
    <source>
      @type tail
      @id in_tail_container_logs
      @label @KUBERNETES
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_type string
          time_format "%Y-%m-%dT%H:%M:%S.%NZ"
          keep_time_key false
        </pattern>
        <pattern>
          format regexp
          expression /^(?<time>.+) (?<stream>stdout|stderr)( (.))? (?<log>.*)$/
          time_format '%Y-%m-%dT%H:%M:%S.%NZ'
          keep_time_key false
        </pattern>
      </parse>
      emit_unmatched_lines true
    </source>
  02_filters.conf: |-
    <label @KUBERNETES>
      <match kubernetes.var.log.containers.fluentd**>
        @type relabel
        @label @FLUENT_LOG
      </match>
      # <match kubernetes.var.log.containers.**_kube-system_**>
      #   @type null
      #   @id ignore_kube_system_logs
      # </match>
      <filter kubernetes.**>
        @type kubernetes_metadata
        @id filter_kube_metadata
        skip_labels false
        skip_container_metadata false
        skip_namespace_metadata true
        skip_master_url true
      </filter>
      <match **>
        @type relabel
        @label @DISPATCH
      </match>
    </label>
  03_dispatch.conf: |-
    <label @DISPATCH>
      <filter **>
        @type prometheus
        <metric>
          name fluentd_input_status_num_records_total
          type counter
          desc The total number of incoming records
          <labels>
            tag ${tag}
            hostname ${hostname}
          </labels>
        </metric>
      </filter>
      <match **>
        @type relabel
        @label @OUTPUT
      </match>
    </label>
  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host "elasticsearch-master"
        port 9200
        path ""
        user elastic
        password changeme
      </match>
    </label>