awslabs / collectd-cloudwatch

A collectd plugin for sending data to Amazon CloudWatch
MIT License

Max Retries Exception - Only One Data Point Recorded? #12

Closed by scott-wood-vgh 8 years ago

scott-wood-vgh commented 8 years ago

I have configured the collectd CloudWatch plugin (Python) to push metrics from the df (disk space) and load plugins to CloudWatch. Exactly one data point per instance and metric is recorded when the service starts, and nothing after that. After a period of time, the following error appears repeatedly in collectd.log:

[AmazonCloudWatchPlugin][cloudwatch.modules.client.putclient] Could not put metric data using the following endpoint: 'https://monitoring.us-east-1.amazonaws.com/'. [Exception: HTTPSConnectionPool(host='monitoring.us-east-1.amazonaws.com', port=443): Max retries exceeded with url: /?Action=PutMetricData&MetricData.member.1.Dimensions.member.1.Name=Host&MetricData.member.1.Dimensions.member.1.Value=i-887de60e&MetricData.member.1.Dimensions.member.2.Name=PluginInstance&MetricData.member.1.Dimensions.member.2.Value=var-log-tomcat7&MetricData.member.1.MetricName=df.percent_bytes.used&MetricData.member.1.StatisticValues.Maximum=16.3765602112&MetricData.member.1.StatisticValues.Minimum=16.3764781952&MetricData.member.1.StatisticValues.SampleCount=6&MetricData.member.1.StatisticValues.Sum=98.2591972351&MetricData.member.1.Timestamp=20161013T181708Z&MetricData.member.2.Dimensions.member.1.Name=Host&MetricData.member.2.Dimensions.member.1.Value=i-887de60e&MetricData.member.2.Dimensions.member.2.Name=PluginInstance&Me

It appears the requests are being rate limited, which I did not see when testing successfully on a smaller research setup. Is there a configuration option I can adjust to mitigate this error so that metric data flows continuously? Please advise.

scott-wood-vgh commented 8 years ago

I fixed this! It turns out the default WriteQueue settings in my collectd.conf were the issue. I uncommented the WriteQueueLimitHigh and WriteQueueLimitLow lines to use the suggested values of 1000000 and 800000, and it worked!
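
For anyone hitting the same symptom, here is a minimal sketch of how to check whether the write queue limits are active in a collectd.conf. It assumes the settings in question are collectd's global WriteQueueLimitHigh and WriteQueueLimitLow options and that they ship commented out, as described above. The sample config text and both helper functions are hypothetical illustrations, not part of collectd or the CloudWatch plugin.

```python
import re

# Hypothetical sample of a stock collectd.conf with the limits commented out.
SAMPLE_CONF = """\
Hostname "localhost"
#WriteQueueLimitHigh 1000000
#WriteQueueLimitLow   800000
"""

# Matches an uncommented "WriteQueueLimitHigh <n>" or "WriteQueueLimitLow <n>" line.
_ACTIVE_OPTION = re.compile(r"^\s*(WriteQueueLimit(?:High|Low))\s+(\d+)\s*$")


def active_write_queue_limits(conf_text):
    """Return {option: value} for WriteQueueLimit* lines that are not commented out."""
    limits = {}
    for line in conf_text.splitlines():
        match = _ACTIVE_OPTION.match(line)
        if match:
            limits[match.group(1)] = int(match.group(2))
    return limits


def uncomment_write_queue_limits(conf_text):
    """Strip the leading '#' from commented WriteQueueLimit* lines."""
    pattern = r"(?m)^#[ \t]*(WriteQueueLimit(?:High|Low)\b)"
    return re.sub(pattern, r"\1", conf_text)


if __name__ == "__main__":
    # Before: no limits are active because both lines are commented out.
    print(active_write_queue_limits(SAMPLE_CONF))
    # After: both limits are active with the suggested values.
    print(active_write_queue_limits(uncomment_write_queue_limits(SAMPLE_CONF)))
```

To use this on a real system, read the actual config file contents into a string (the path varies by distribution) and restart collectd after editing it.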