thearifismail opened this issue 5 years ago
Looks like the problem is in the host and not in Fluent Bit. Did you check if your Node is under memory pressure?
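If it helps, a quick way to check (a sketch, assuming kubectl access to the cluster; <node-name> is a placeholder) is to look at the node conditions:

kubectl describe node <node-name> | grep -A 8 "Conditions:"
# or list the MemoryPressure condition for every node at once:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="MemoryPressure")].status}{"\n"}{end}'

If MemoryPressure shows True, the kubelet may be evicting or throttling pods on that node.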
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
fluent-bit.conf: |
[SERVICE]
Flush 1
Log_Level info
Daemon off
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
@INCLUDE input-kubernetes.conf
@INCLUDE filter-kubernetes.conf
@INCLUDE output-elasticsearch.conf
input-kubernetes.conf: |
[INPUT]
Name tail
Tag kube.*
Path /data/docker/containers/*/*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
filter-kubernetes.conf: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc.cluster.local:443
Merge_Log On
K8S-Logging.Parser On
output-elasticsearch.conf: |
[OUTPUT]
Name es
Match *
Host ${FLUENT_ELASTICSEARCH_HOST}
Port ${FLUENT_ELASTICSEARCH_PORT}
Logstash_Format Off
Retry_Limit False
Index fluent-bit
parsers.conf: |
[PARSER]
Name apache
Format regex
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
[PARSER]
Name apache2
Format regex
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name apache_error
Format regex
Regex ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$
[PARSER]
Name nginx
Format regex
Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name json
Format json
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# Command | Decoder | Field | Optional Action
# =============|==================|=================
Decode_Field_As escaped log
[PARSER]
Name syslog
Format regex
Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
Time_Key time
Time_Format %b %d %H:%M:%S
With the above daemonset, I have started getting logs, but there are a lot of entries like this:

{
  "_index": "fluent-bit",
  "_type": "flb_type",
  "_id": "8CZFqGcBeFDOdcHK3wlB",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-12-13T15:53:30.999Z",
    "log": "[2018/12/13 15:53:30] [ warn] [filter_kube] invalid pattern for given tag kube.data.docker.containers.e21ddf07416d5cf36cdde9b05b9efffb163e7b43f87cb55c87a0ae470c932757.e21ddf07416d5cf36cdde9b05b9efffb163e7b43f87cb55c87a0ae470c932757-json.log\n",
    "stream": "stderr",
    "time": "2018-12-13T15:53:30.999064873Z"
  },
  "fields": {
    "@timestamp": [ "2018-12-13T15:53:30.999Z" ],
    "time": [ "2018-12-13T15:53:30.999Z" ]
  },
  "sort": [ 1544716410999 ]
}
Should I file a separate issue for it, or is it related to the current one so I should keep both in this thread?
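For what it's worth, that warning most likely comes from the tail Path: the kubernetes filter parses the tag to extract the pod name, namespace, and container, and its default pattern expects file names like the kubelet's /var/log/containers/<pod>_<namespace>_<container>-<id>.log symlinks rather than the raw /data/docker/containers/<id>/<id>-json.log files. A minimal sketch of the more common input layout (an assumption on my side: that the /var/log/containers symlinks exist on the nodes and that both /var/log and the Docker data directory are mounted into the pod, since the symlinks resolve into it):

input-kubernetes.conf: |
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10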
So, any update?
Did you find any solution? I have the same problem.
I need help locating what is blocking the fluent-bit containers from pushing logs to Elasticsearch.
This setup works without any problems in one environment, but not in our staging environment, where it must succeed before we can move on to production.
Setup
Kubernetes v1.11 (installed using the RKE CLI, with control plane, etcd, and workers on separate nodes)
Elasticsearch v6.4.3 (native install)
Fluent Bit image: fluent/fluent-bit:0.14.6
Kibana v6.4.2
The elasticsearch host is accessible from every node in the problem cluster. Fluent-bit containers can read logs but what happens after that is a mystery. Here is the docker log from one of the nodes:
docker logs 54b2ed96ca7f
Fluent-Bit v0.14.6
Copyright (C) Treasure Data
[2018/12/07 22:15:28] [ info] [engine] started (pid=1)
[2018/12/07 22:15:28] [ info] [filter_kube] https=1 host=kubernetes.default.svc.cluster.local port=443
[2018/12/07 22:15:28] [ info] [filter_kube] local POD info OK
[2018/12/07 22:15:28] [ info] [filter_kube] testing connectivity with API server...
[2018/12/07 22:15:28] [ info] [filter_kube] API server connectivity OK
[2018/12/07 22:15:28] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
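One quick check that might narrow this down (a sketch, reusing the host/port variables and the Index fluent-bit from the output config above) is to ask Elasticsearch directly whether the index is receiving documents:

curl -s "http://${FLUENT_ELASTICSEARCH_HOST}:${FLUENT_ELASTICSEARCH_PORT}/_cat/indices?v" | grep fluent-bit
curl -s "http://${FLUENT_ELASTICSEARCH_HOST}:${FLUENT_ELASTICSEARCH_PORT}/fluent-bit/_count"

If the index never appears, the requests are probably not leaving Fluent Bit or are being rejected; if it exists but the count does not grow, indexing errors on the Elasticsearch side are more likely.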
I don't know if it has any bearing, but I don't have permission on the system to check if port 2020 is available or not.
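A possible way to look at port 2020 without node access is to go through the Kubernetes API instead (a sketch, assuming kubectl access; the pod name is a placeholder):

kubectl -n logging port-forward <fluent-bit-pod-name> 2020:2020
# in another terminal:
curl -s http://127.0.0.1:2020/api/v1/metrics
curl -s http://127.0.0.1:2020/api/v1/metrics/prometheus

The metrics include per-output error and retry counters, which should show whether the es output is failing or simply never flushing.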
The /var/log/messages file in the fluent-bit container on one node is flooded with messages like the following:
kernel: ipmi-sensors:61430 map pfn expected mapping type uncached-minus for [mem 0xbfee0000-0xbfee0fff], got write-back
Dec 7 22:44:37 , dockerd: time="2018-12-07T22:44:37.062465721Z" level=error msg="Error running exec in container: OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\"bash\\": executable file not found in $PATH\": unknown"
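That last error only says that bash is not present in the container image; if a shell is needed for poking around, it may be worth trying sh instead (assuming the image ships a minimal shell at all):

docker exec -it 54b2ed96ca7f sh
# or through Kubernetes, with a placeholder pod name:
kubectl -n logging exec -it <fluent-bit-pod-name> -- sh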