ricsanfre closed this issue 2 years ago.
Fluentd also needs to be configured to export Prometheus metrics. See the [fluentd documentation](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus).
Make the fluentd forwarder port available from outside the cluster, so logs can be collected from external hosts (i.e. the gateway), and remove the current exposure of the ES service. Communications with the exposed fluentd service must be secured: TLS needs to be enabled on the exposed service to encrypt the traffic, and an authentication mechanism needs to be activated. Within the cluster Linkerd already encrypts inter-pod communications, but an authentication mechanism must still be provided between the fluentbit forwarders and the fluentd aggregator.
For receiving logs from outside the cluster, TLS needs to be enabled in any case. The TLS certificate can be automatically generated by cert-manager.
```
<source>
  @type forward
  port 24224
  bind 0.0.0.0
  <transport tls>
    cert_path /fluentd/certs/tls.crt
    private_key_path /fluentd/certs/tls.key
  </transport>
  <security>
    self_hostname fluentd-aggregator
    shared_key s1cret0
  </security>
</source>
```
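On the forwarder side, fluent-bit would need a matching forward output using TLS and the same shared key. A minimal sketch (the host name, `Self_Hostname` value and CA file path are assumptions matching the example above):

```
[OUTPUT]
    Name          forward
    Match         *
    Host          fluentd.picluster.ricsanfre.com
    Port          24224
    Self_Hostname gateway
    Shared_Key    s1cret0
    tls           On
    tls.verify    On
    tls.ca_file   /etc/fluent-bit/certs/ca.crt
```

`tls.ca_file` should point to the custom cluster CA so the forwarder can validate the certificate issued by cert-manager.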
1) Generate the fluentd TLS certificate with cert-manager, using the custom cluster CA.
```yml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: fluentd-tls
  namespace: k3s-logging
spec:
  # Secret names are always required.
  secretName: fluentd-tls
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  commonName: fluentd.picluster.ricsanfre.com
  isCA: false
  privateKey:
    algorithm: ECDSA
    size: 256
  usages:
    - server auth
    - client auth
  # At least one of a DNS Name, URI, or IP address is required.
  dnsNames:
    - fluentd.picluster.ricsanfre.com
  # ClusterIssuer: ca-issuer.
  issuerRef:
    name: ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io
```
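The `ca-issuer` referenced above is assumed to be a CA-type ClusterIssuer backed by the custom cluster CA stored in a secret; a sketch (the secret name is an assumption):

```yml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: ca-issuer
spec:
  ca:
    # Secret in the cert-manager namespace containing the custom CA cert/key pair
    secretName: root-ca
```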
cert-manager will create a TLS Secret:
```yml
apiVersion: v1
kind: Secret
metadata:
  name: fluentd-tls
  namespace: k3s-logging
data:
  tls.crt: <base64-encoded cert>
  tls.key: <base64-encoded key>
type: kubernetes.io/tls
```
2) Mount that certificate in the fluentd pod as a volume at `/fluentd/certs`:
```yml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fluentd
  name: fluentd
  namespace: k3s-logging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - image: "{{ efk_fluentd_aggregator_image }}"
          imagePullPolicy: Always
          name: fluentd
          env:
            # Elastic operator creates the elastic service name with format <cluster_name>-es-http
            - name: FLUENT_ELASTICSEARCH_HOST
              value: efk-es-http
            # Default elasticsearch port
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
            # Elasticsearch user
            - name: FLUENT_ELASTICSEARCH_USER
              value: "elastic"
            # Elastic operator stores the elastic user password in a secret
            - name: FLUENT_ELASTICSEARCH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: "efk-es-elastic-user"
                  key: elastic
            # Set an index prefix for fluentd. By default the index is logstash
            - name: FLUENT_ELASTICSEARCH_INDEX_NAME
              value: fluentd
            - name: FLUENT_ELASTICSEARCH_LOG_ES_400_REASON
              value: "true"
          ports:
            - containerPort: 24224
              name: forward
              protocol: TCP
            - containerPort: 24231
              name: prometheus
              protocol: TCP
          volumeMounts:
            - mountPath: /fluentd/etc
              name: config
              readOnly: true
            - mountPath: /fluentd/certs
              name: fluentd-tls
              readOnly: true
      volumes:
        - configMap:
            defaultMode: 420
            name: fluentd-config
          name: config
        # TLS secret created by cert-manager (referenced by the volumeMount above)
        - name: fluentd-tls
          secret:
            secretName: fluentd-tls
```
Fluentbit and fluentd filesystem buffering mechanisms should be enabled.
The fluentd aggregator should be deployed in HA: a Kubernetes Deployment with several replicas. A Kubernetes HPA (Horizontal Pod Autoscaler) could be configured to automatically scale the number of replicas.
Fluentd could be deployed as a StatefulSet instead of a Deployment, with a dedicated PVC for the disk buffer. This way, if a pod is terminated, the buffered data is not lost.
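The filesystem buffering mentioned above would be enabled in the elasticsearch output with a file buffer; a sketch (the buffer path and tuning values are assumptions, not taken from the current configuration):

```
<match **>
  @type elasticsearch
  # ... elasticsearch connection settings ...
  <buffer>
    @type file
    # On a StatefulSet this path would live on the dedicated PVC
    path /fluentd/buffers/elasticsearch
    flush_interval 5s
    retry_forever true
    retry_max_interval 30
  </buffer>
</match>
```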
The fluentd official helm chart also supports deploying fluentd as a Deployment or StatefulSet instead of a DaemonSet. In the Deployment case, HPA is also supported.
values.yml could be something like this:
```yml
# Deploy fluentd as deployment
kind: "Deployment"
# Number of replicas
replicaCount: 1
# Enabling HPA
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80
# Do not create serviceAccount, RBAC and podSecurityPolicy objects
serviceAccount:
  create: false
rbac:
  create: false
podSecurityPolicy:
  enabled: false
## Additional environment variables to set for fluentd pods
env:
  ...
# Volumes and VolumeMounts (only configuration files and certificates)
volumes:
  - name: etcfluentd-main
    configMap:
      name: fluentd-main
      defaultMode: 0777
  - name: etcfluentd-config
    configMap:
      name: fluentd-config
      defaultMode: 0777
  - name: fluentd-tls
    secret:
      secretName: fluentd-tls
volumeMounts:
  - name: etcfluentd-main
    mountPath: /etc/fluent
  - name: etcfluentd-config
    mountPath: /etc/fluent/config.d/
  - mountPath: /fluentd/certs
    name: fluentd-tls
    readOnly: true
service:
  type: "ClusterIP"
  annotations: {}
  # loadBalancerIP:
  # externalTrafficPolicy: Local
  ports:
    - name: "forwarder"
      protocol: TCP
      containerPort: 24224
    - name: prometheus
      containerPort: 24231
      protocol: TCP
## Fluentd list of plugins to install
##
plugins: []
# - fluent-plugin-out-http
## Add fluentd config files from K8s configMaps
##
configMapConfigs:
  - fluentd-prometheus-conf
  # - fluentd-systemd-conf
## Fluentd configurations:
##
fileConfigs:
  01_sources.conf: |-
    ## logs from podman
    <source>
      @type tail
      @id in_tail_container_logs
      @label @KUBERNETES
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_type string
          time_format "%Y-%m-%dT%H:%M:%S.%NZ"
          keep_time_key false
        </pattern>
        <pattern>
          format regexp
          expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
          time_format '%Y-%m-%dT%H:%M:%S.%NZ'
          keep_time_key false
        </pattern>
      </parse>
      emit_unmatched_lines true
    </source>
  02_filters.conf: |-
    <label @KUBERNETES>
      <match kubernetes.var.log.containers.fluentd**>
        @type relabel
        @label @FLUENT_LOG
      </match>
      # <match kubernetes.var.log.containers.**_kube-system_**>
      #   @type null
      #   @id ignore_kube_system_logs
      # </match>
      <filter kubernetes.**>
        @type kubernetes_metadata
        @id filter_kube_metadata
        skip_labels false
        skip_container_metadata false
        skip_namespace_metadata true
        skip_master_url true
      </filter>
      <match **>
        @type relabel
        @label @DISPATCH
      </match>
    </label>
  03_dispatch.conf: |-
    <label @DISPATCH>
      <filter **>
        @type prometheus
        <metric>
          name fluentd_input_status_num_records_total
          type counter
          desc The total number of incoming records
          <labels>
            tag ${tag}
            hostname ${hostname}
          </labels>
        </metric>
      </filter>
      <match **>
        @type relabel
        @label @OUTPUT
      </match>
    </label>
  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host "elasticsearch-master"
        port 9200
        path ""
        user elastic
        password changeme
      </match>
    </label>
```
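To make the forwarder port available from outside the cluster, as described above, the `service` section of those values could be switched to a LoadBalancer type; a sketch assuming MetalLB assigns the external IP (the address-pool annotation and IP are assumptions):

```yml
service:
  type: "LoadBalancer"
  annotations:
    metallb.universe.tf/address-pool: default
  # loadBalancerIP: <static IP from the MetalLB pool>
  ports:
    - name: "forwarder"
      protocol: TCP
      containerPort: 24224
```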
Enhancement Request
Add a log aggregation layer to the logging architecture. From this layer, logs can be aggregated, filtered and routed to different destinations for further processing (elasticsearch, kafka, s3, etc.)
source: Common architecture patterns with fluentd and fluentbit
Implementation details
The log aggregation layer can be based on fluentd or fluentbit. Both can be used as log forwarders and log aggregators (see the fluentbit documentation). The only difference is the number of plugins (input, output, etc.) available.
Fluentbit does not support a kafka input plugin (only output). Fluentd supports kafka integration as both input and output. Fluentd should be the right choice for the log aggregation layer in case the logging architecture evolves in the future to include a Kafka cluster as a buffering mechanism between log forwarders and log aggregators.
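In that Kafka scenario, the fluentd aggregator could consume from the cluster with fluent-plugin-kafka's consumer-group input; a sketch (broker addresses, topic and consumer group names are assumptions):

```
<source>
  @type kafka_group
  # Comma-separated list of Kafka brokers
  brokers kafka-broker-0:9092,kafka-broker-1:9092
  consumer_group fluentd-aggregator
  topics logs
  format json
</source>
```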
source: One Year of Log Management at Vinted
Changes to the current logging architecture: