Closed timberhill closed 2 years ago
Encountered this same issue with the latest version (1.4.15) of splunk-connect-for-kubernetes:
│ Labels: app=splunk-kubernetes-logging
│ chart=splunk-kubernetes-logging-1.4.15
│ splunk-fluentd-k8s-logs:
│ Image: docker.io/splunk/fluentd-hec:1.2.13
This issue is causing our log files to grow massively. As a workaround, since these logs appear as info, I set the logging level to warn in our values.yaml
to workaround this issue:
splunk-kubernetes-logging:
charEncodingUtf8: true
logLevel: warn
Example log from app:
2022/05/20 16:52:11 <redacted>"GET /manager/html HTTP/1.0" 200 6 "" 9.7µs
2022/05/20 16:52:11 <redacted> "GET /api/nsstats HTTP/1.0" 200 6 "" 23.9µs
2022/05/20 16:52:11 <redacted> "GET /ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/ø®ø®/etc/passwd HTTP/1.0" 200 6 "" 7.9µs
My error log (replaced (``) with ('') for formatting purposes)
2022-05-20 16:52:59 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:58 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:57 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:56 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:55 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:54 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:53 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:52 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:51 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:50 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:49 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:48 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:47 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:46 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:45 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:44 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:43 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:42 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:41 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:40 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:39 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:38 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:37 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:36 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:35 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:34 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:33 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:32 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:31 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:30 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:29 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:28 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:27 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:26 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:25 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:24 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:23 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:22 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:21 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:20 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:19 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:18 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:17 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:16 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:15 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:14 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:13 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:12 +0000 [info]: #0 invalid byte sequence is replaced in '2022-05-20 16:52:11 +0000 [info]: #0 invalid byte sequence is replaced in '2022/05/20 16:52:11 <redacted> "GET / / / / / / / / / / / / /etc/passwd HTTP/1.0" 200 6 "" 7.9µs'''''''''''''''''''''''''''''''''''''''''''''''''
Describe the bug
When trying to process a string that contains an invalid byte sequence,
fluentd
logs out an info level line that contains the original problematic string. If it is also handling its own log forwarding, this leads to recursive logging.To Reproduce
Produce an application log with a non-UTF-8 character
a. Fluentd picks it up as a part of SCK and produces the
invalid byte sequence is replaced
log lineb. That log line is picked up bu fluentd.
c. Repeats indefinitely
Expected behavior
The
invalid byte sequence is replaced
log line is produced once.Your Environment
This is run as a part of Splunk Connect for Kubernetes v1.4.10
Your Configuration
Default helm installation for Splunk connect for Kubernetes.
ConfigMap from CSK: https://github.com/splunk/splunk-connect-for-kubernetes/blob/1.4.10/manifests/splunk-kubernetes-logging/configMap.yaml
Your Error Log
Additional context
The function that seems to be producing the issue is
string_safe_encoding
. The log line is produced before the invalid character is replaced in the string.I am also raising a PR with the simple fix.