1.7 upgraded the dogstatsd client. Is it possible you need to upgrade the agent? Exactly which software and version is the agent running?
Our logs say we're running AmazonCloudWatchAgent 1.247359.0, which seems to be the latest(ish) release: https://github.com/aws/amazon-cloudwatch-agent/releases
We do a docker pull public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
with each redeploy to stay current with their releases.
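Concretely, that redeploy step is just:

```sh
# Run on every redeploy; we track the :latest tag rather than pinning a specific release.
docker pull public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest
```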
I guess I don't understand the error. Is that line malformed? If so, how?
This is getting beyond my area of expertise, but the CloudWatch agent StatsD docs say:
CloudWatch supports the following StatsD format:
MetricName:value|type|@sample_rate|#tag1:value...
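So a well-formed line in that format would look something like this (metric name and tag invented purely for illustration):

```
my.metric.name:1|c|@1.0|#env:production
```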
However, I'm now also seeing that the DogStatsD protocol v1.2 added a new container ID
field.
Looking at another example from our logs, my guess is the CW Agent's lack of support for this field is what's tripping it up:
2023-05-18T18:13:07Z E! Error: parsing sample rate, , it must be in format like: @0.1, @0.5, etc. Ignoring sample rate for line: faktory.throttle.lock:10|c|c:96eba0d873fa4c2ea751083e66d99e77-2157570088
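Breaking that line down field by field (my reading, at least):

```
faktory.throttle.lock:10|c|c:96eba0d873fa4c2ea751083e66d99e77-2157570088

metric name : faktory.throttle.lock
value       : 10
type        : c (counter)
extra field : c:96eba0d873fa4c2ea751083e66d99e77-2157570088
              (the DogStatsD v1.2 container ID field, which the agent
               appears to parse as a malformed sample rate)
```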
Assuming that's a solid guess, I'll see if there's anything I can do on my end to get the CW Agent to handle it better -- but might there be a way to disable use of that field on Faktory's side?
I would open an issue with the CloudWatch agent repo/team and ask for their advice. It's possible a lot of people are having this same problem, not just with Faktory. Will they support proprietary protocol extensions like this, or at least have their stream reader parse and discard the extra data?
You can try starting Faktory with DD_ORIGIN_DETECTION_ENABLED=false
and see if that disables reporting the containerId.
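For example, if you launch it via docker run, something like this should pass it through (the image reference below is just a placeholder for your Faktory Enterprise image; keep the rest of your flags the same):

```sh
# Sketch only: substitute your actual Faktory Enterprise image and existing flags.
docker run -e DD_ORIGIN_DETECTION_ENABLED=false <your-faktory-enterprise-image>
```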
It looks like DD_ORIGIN_DETECTION_ENABLED=false
does the trick! Thank you for your assistance.
We just upgraded to Faktory Enterprise v1.7.0 (from the contribsys docker image), and our CloudWatch statsd collection agent started logging a constant slew of parsing errors.
Our statsd.toml looks like:

Sample of logs from the statsd agent:
No errors before we upgraded from v1.6.2 to v1.7.0.
Does v1.7.0 need additional settings in our statsd.toml to maintain the v1.6.2 behavior?