advantageous / systemd-cloud-watch

Alt util to AWS cloudwatch agent that works w/ systemd journal and sends the data in batches to AWS cloudwatch.
https://advantageous.github.io/systemd-cloud-watch/
90 stars 36 forks source link

Log spam on Ubuntu 16.04 after a couple minutes #17

Open ryanwitt opened 6 years ago

ryanwitt commented 6 years ago

I've been testing this tool out to use in production and came across what looks like an annoying bug in the latest released binary (v0.2.1). Several minutes into a run, the worker starts spewing logs at ~12KiB/s:

main INFO: 2017/10/25 02:53:22 config.go:125: Loading config config.conf
main INFO: 2017/10/25 02:53:22 creators.go:31: Creating repeater that is conneting to AWS cloud watch
aws INFO: 2017/10/25 02:53:22 aws.go:79: LogStreamName was not set so using cloudwatch log test-172-31-5-100-us-east-1b 
main INFO: 2017/10/25 02:53:22 workers.go:203: Success: Seek to head of systemd journal
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10261,  emptyCount 10263
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10262,  emptyCount 10264
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10263,  emptyCount 10265
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10264,  emptyCount 10266
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10265,  emptyCount 10267
main INFO: 2017/10/25 02:55:23 workers.go:86: Systemd CloudWatch: batches sent 1, idleCount 10266,  emptyCount 10268
...

Fortunately, I don't think these correspond to CloudWatch API calls, since I don't see them in the network traffic while these messages are being printed. However, if the logger is set up inside a systemd unit, this could cause lots of feedback spam to CloudWatch logs. 🌊

The batches sent number seems to me like it might reflect the number of journal entries since the logger started.

Environment: Ubuntu 16.04 LTS

$ uname -a
Linux ip-172-31-5-100 4.4.0-1022-aws #31-Ubuntu SMP Tue Jun 27 11:27:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Installation:

curl -OL https://github.com/advantageous/systemd-cloud-watch/releases/download/v0.2.1/systemd-cloud-watch_linux

chmod 0755 systemd-cloud-watch_linux
cat > config.conf < EOF
log_group = "cloudwatch-test"
EOF

nohup ./systemd-cloud-watch_linux -f config.conf

IAM Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams"
            ],
            "Resource": [
                "arn:aws:logs:*:*:log-group:*",
                "arn:aws:logs:*:*:log-group:*:log-stream:*"
            ]
        }
    ]
}

Any ideas on what this could be before I start learning go?

ryanwitt commented 6 years ago

actually fixed in https://github.com/advantageous/systemd-cloud-watch/pull/16

donovan commented 6 years ago

Can confirm this issue occurs on Ubuntu 18.04 LTS with v0.2.1.

donovan commented 6 years ago

I tried reverting to v0.2.0 to avoid the issue but its still there unfortunately.

donovan commented 6 years ago

I am working around this issue by adding StandardOutput=null in the [Service] of my unit file /lib/systemd/system/journald-cloudwatch-logs.service

iamakulov commented 4 years ago

I’m working around the issue by piping the output through grep -v. E.g.:

# journald-cloudwatch-logs.service
- ExecStart=/usr/local/bin/journald-cloudwatch-logs /usr/local/etc/journald-cloudwatch-logs.conf
+ ExecStart=/bin/sh -c "/usr/local/bin/journald-cloudwatch-logs /usr/local/etc/journald-cloudwatch-logs.conf | grep --line-buffered -v 'Systemd CloudWatch: batches sent'"

Unlike @donovan’s approach, this allows the binary to log other entries – which can be useful if it e.g. logs an error.

--line-buffered in grep is required because otherwise grep will hold off logging for a while, and you won’t see any logs. (This is different from when logging into stdout.)