splunk / splunk-connect-for-kubernetes

Helm charts associated with kubernetes plug-ins
Apache License 2.0

Getting timeout flush error in fluentd logs #346

Closed Prakashreddy134 closed 3 years ago

Prakashreddy134 commented 4 years ago

Hi,

I am getting the timeout flush error below. How can we fix it?

2020-03-16 08:10:48 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers

rockb1017 commented 4 years ago

It says it is [info], not an error.

Prakashreddy134 commented 4 years ago

Yes @rockb1017, I see the error at info level:

2020-03-17 04:46:58 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers

rockb1017 commented 4 years ago

Are you saying these are actual error logs? Could you post more logs?

Prakashreddy134 commented 4 years ago

Please find the Splunk pod logs below:

2020-03-17 09:56:38 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:56:48 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:56:58 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:08 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:14 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers
2020-03-17 09:57:18 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:28 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:30 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:38 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:48 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:57:58 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:58:08 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:58:18 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:58:28 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.
2020-03-17 09:58:38 +0000 [info]: #0 Timeout flush: tail.containers.var.log.containers.

rockb1017 commented 4 years ago

From my search, this message seems to come from a bug in the concat plugin: https://github.com/fluent-plugins-nursery/fluent-plugin-concat/issues/4#issuecomment-238085219

rockb1017 commented 4 years ago

Are all container logs failing to get ingested into Splunk?

Prakashreddy134 commented 4 years ago

Yes, for all Splunk pods I see the same timeout flush message for all containers.

I added multiline configuration that uses the concat plugin to the ConfigMap, and I believe this is why the most recent event is not flushed until the next event occurs. For example, an event that occurred at 6:30 am does not show up in Splunk until another event occurs, say at 6:40 am; only then does the 6:30 am event appear. How can we fix this? If I remove my multiline configuration, it seems to work: the most recent events show up in Splunk.
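To make the behavior described above concrete, here is a minimal Python sketch of a multiline concatenator. It is a toy model, not the fluent-plugin-concat source: the class `MultilineBuffer`, its methods, and the 5-second `flush_interval` value are all hypothetical names and numbers chosen for illustration. It shows why a buffered event can only leave the buffer when either the next start line arrives or an idle timeout fires.

```python
import re


class MultilineBuffer:
    """Toy model of multiline concatenation (hypothetical, not the plugin's code)."""

    def __init__(self, start_pattern, flush_interval):
        self.start = re.compile(start_pattern)
        self.flush_interval = flush_interval  # seconds of idle time before a timeout flush
        self.lines = []
        self.last_seen = None

    def feed(self, now, line):
        """Buffer a line; return the previous event if this line starts a new one."""
        done = None
        if self.start.match(line) and self.lines:
            done = "\n".join(self.lines)
            self.lines = []
        self.lines.append(line)
        self.last_seen = now
        return done

    def tick(self, now):
        """Return the buffered event if no line has arrived for flush_interval seconds."""
        if self.lines and now - self.last_seen >= self.flush_interval:
            done = "\n".join(self.lines)
            self.lines = []
            return done
        return None


# Assumed start pattern: a new event begins with a date such as 2020-03-17.
buf = MultilineBuffer(r"^\d{4}-\d{2}-\d{2}", flush_interval=5)
first = buf.feed(0, "2020-03-17 06:30:00 ERROR boom")  # buffered, nothing emitted
second = buf.feed(1, "  at Foo.bar(Foo.java:1)")       # continuation, still buffered
idle = buf.tick(3)                                     # only 2 s idle, nothing emitted
flushed = buf.tick(6)                                  # 5 s idle, event emitted by timeout
```

In this model the event is held until `tick` sees enough idle time, so a shorter idle interval trades lower latency for a higher chance of splitting a slow multiline event.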

Prakashreddy134 commented 4 years ago

This issue is similar to "Events not flushing until next event occurs because of multine code block #324" and "Multiline events not flushing until next event occurs #243". I don't see any solution for those issues, and they are still open. Can you help us fix this?

matthewmodestino commented 4 years ago

Timeout flushes are normal: they mean the chunk hasn't filled and is instead being sent based on time. You need to tune your environment for the level of traffic you have. What cluster are you working in when you see this: a personal test cluster or a cluster that is actually in use?
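The size-versus-time distinction in this comment can be sketched in Python as follows. This is a simplified model under stated assumptions, not fluentd's buffer implementation: `ChunkBuffer`, `max_records`, and `flush_interval` are hypothetical names, and the chunk is measured in records rather than bytes.

```python
class ChunkBuffer:
    """Toy buffer that flushes when full or when the chunk is too old (hypothetical)."""

    def __init__(self, max_records, flush_interval):
        self.max_records = max_records
        self.flush_interval = flush_interval  # seconds a chunk may stay open
        self.records = []
        self.opened_at = None

    def append(self, now, record):
        """Add a record; flush with reason "size" once the chunk is full."""
        if not self.records:
            self.opened_at = now
        self.records.append(record)
        if len(self.records) >= self.max_records:
            return self._flush("size")
        return None

    def tick(self, now):
        """Flush with reason "timeout" if the chunk has been open too long."""
        if self.records and now - self.opened_at >= self.flush_interval:
            return self._flush("timeout")
        return None

    def _flush(self, reason):
        chunk, self.records = self.records, []
        return reason, chunk


# Busy source: the chunk fills before the timer matters.
busy = ChunkBuffer(max_records=3, flush_interval=10)
busy_results = [busy.append(t, f"event-{t}") for t in range(3)]

# Quiet source: the chunk never fills, so it is sent based on time.
quiet = ChunkBuffer(max_records=3, flush_interval=10)
quiet.append(0, "event-0")
quiet_result = quiet.tick(10)
```

In a low-traffic cluster every flush in this model would be a timeout flush, which matches the point that such messages by themselves do not indicate a failure.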