splunk / docker-logging-plugin

Splunk Connect for Docker is a Docker logging plugin that allows Docker containers to send their logs directly to Splunk Enterprise or a Splunk Cloud deployment.
Apache License 2.0

Delay in sending logs to Splunk #46

Closed: cwaldbieser closed this issue 6 years ago

cwaldbieser commented 6 years ago

I am using the Splunk logging driver to send logs to Splunk with the following command line:

docker run -d -p 443:8443 \
  --log-driver=splunk \
  --log-opt splunk-token=REDACTED \
  --log-opt splunk-url=https://myloghost.example.net:8088 \
  --log-opt splunk-sourcetype=idp \
  --log-opt splunk-index=auth_idp \
  --log-opt splunk-insecureskipverify=1 \
  --log-opt splunk-format=raw \
  --log-opt splunk-gzip=true \
  --name shib \
  --restart always \
  --health-cmd 'curl -k -f https://127.0.0.1:8443/idp/status || exit 1' \
  --health-interval=2m \
  --health-timeout=30s

The container runs normally, and logs flow into Splunk. All is good. This is in a testing environment, so it is not always in use, but the container is left running. Sometimes, when I start using the service the container provides, nothing is logged to Splunk immediately. If I wait 10-15 minutes, the logs eventually show up with the correct time stamps, etc.

I've noticed on the docker host that netstat -tpn | grep -e 8088 gives me output similar to this:

Active Internet connections (w/o servers)
Proto Recv-Q    Send-Q  Local Address           Foreign Address         State       PID/Program name    
tcp        0    947     xxx.xxx.x.xxx:49010     xxx.xxx.x.xx:8088       ESTABLISHED 12682/dockerd-curre   

On the Splunk host, the same command shows zeroes in the Recv-Q and Send-Q columns. The Splunk Distributed Management Console doesn't show any events received during the lag window. On the Docker host, Docker writes a message to /var/log/messages at the same moment the delayed logs finally reach Splunk:

Jul  6 13:14:19 idpdock0-0 dockerd-current: time="2018-07-06T13:14:19.428396282-04:00" level=error msg="Post https://myloghost.example.net:8088/services/collector/event/1.0: read tcp xxx.xxx.x.xxx:49010->xxx.xxx.x.xx:8088: read: connection timed out"

It seems to me like the logging driver gets stuck on some blocking I/O operation, and when it finally times out, it retries and the logs are sent. However, I have no idea what condition causes it to get stuck, nor do I know of any way to adjust the timeout period.
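Two checks that can narrow this down (sketches only; the host and token below are the example/redacted values from the command line above). First, a manual POST to the HEC endpoint shown in the error message, to see whether the stall is in the driver or in the network path:

curl -k https://myloghost.example.net:8088/services/collector/event/1.0 \
  -H "Authorization: Splunk REDACTED" \
  -d '{"event": "connectivity test"}'
# HEC normally answers {"text":"Success","code":0}; if this also hangs
# during a lag window, the connection path is the problem, not the driver.

Second, the kernel retransmission setting on the Docker host. A write on a silently dead TCP connection (consistent with the bytes stuck in Send-Q above) only fails after net.ipv4.tcp_retries2 retransmissions, and the default of 15 works out to roughly 13 to 30 minutes, which would explain the delay:

sysctl net.ipv4.tcp_retries2
# lowering this (e.g. sysctl -w net.ipv4.tcp_retries2=8) makes dead connections
# fail faster; note this is a system-wide kernel knob, not a driver option.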

cwaldbieser commented 6 years ago

Also, if this is not the actual project for the Docker-Splunk logging driver, I apologize, and you can close this issue.

dbaldwin-splunk commented 6 years ago

@cwaldbieser Docker logging drivers can be found at https://github.com/moby/moby.

This is the Splunk Docker logging plugin (Splunk Connect for Docker), which is a replacement for the logging driver. We have found it to be more stable, more scalable, and a better solution for most customers. It is also supported by Splunk, assuming an active support contract is in place.
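For reference, a minimal sketch of switching a container over to the plugin, assuming the install alias used in this repo's README (splunk-logging-plugin) and a placeholder image name; the splunk-* log-opts are the same ones the built-in driver accepts:

docker plugin install splunk/docker-logging-plugin:latest \
  --alias splunk-logging-plugin --grant-all-permissions

docker run -d \
  --log-driver=splunk-logging-plugin \
  --log-opt splunk-url=https://myloghost.example.net:8088 \
  --log-opt splunk-token=REDACTED \
  --log-opt splunk-insecureskipverify=1 \
  your-image    # hypothetical image name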

dtregonning commented 6 years ago

Closing. Sorry @cwaldbieser, as David mentioned, this project is for Splunk Connect for Docker, not the Splunk logging driver. Please reach out if you have any issues getting the Splunk logging plugin up and running.