prometheus / cloudwatch_exporter

Metrics exporter for Amazon AWS CloudWatch
Apache License 2.0

[metrics]: elasticache redis with tags #748

Open jurgenweber opened 1 month ago

jurgenweber commented 1 month ago

Context information

Exporter configuration

```yaml
metrics:
  - aws_dimensions:
      - CacheNodeId
      - CacheClusterId
    aws_metric_name: EngineCPUUtilization
    aws_namespace: AWS/ElastiCache
    aws_statistics:
      - Average
    aws_tag_select:
      tag_selections:
        Environemnt: ${cluster.name}
      resource_type_selection: 'elasticache:replication-group'
      resource_id_dimension: CacheClusterId
```

Exporter logs

```log
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter WARNING: CloudWatch scrape failed
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter java.lang.ClassCastException: class java.lang.String cannot be cast to class java.util.Collection (java.lang.String and java.util.Collection are in module java.base of loader 'bootstrap')
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.cloudwatch.CloudWatchCollector.getResourceTagMappings(CloudWatchCollector.java:373)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.cloudwatch.CloudWatchCollector.scrape(CloudWatchCollector.java:479)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.cloudwatch.CloudWatchCollector.collect(CloudWatchCollector.java:642)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.Collector.collect(Collector.java:45)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:204)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:219)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:152)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.exporter.common.TextFormat.writeOpenMetrics100(TextFormat.java:202)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.exporter.common.TextFormat.writeFormat(TextFormat.java:57)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.servlet.common.exporter.Exporter.doGet(Exporter.java:75)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at io.prometheus.client.servlet.jakarta.exporter.MetricsServlet.doGet(MetricsServlet.java:52)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:500)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:529)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.Server.handle(Server.java:563)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
cloudwatch-exporter-79884fb5d7-4dbmv prometheus-cloudwatch-exporter at java.base/java.lang.Thread.run(Unknown Source)
```

What do you expect to happen?

I expect to see metrics for the Redis instances that have the tag.

What happened instead?

I get that error in a loop

Other comments

I have tried many iterations of the config to work out what it needs to be. If I remove the tag select, it works fine. Examples include:

              resource_type_selection: 'elasticache:replicationgroup'
              resource_id_dimension: ReplicationGroup

Or

              resource_type_selection: 'elasticache:replicationgroup'
              resource_id_dimension: CacheNodeId

and every iteration of the above

jurgenweber commented 1 month ago

so I have worked out that

          - aws_dimensions:
              - CacheNodeId
              - CacheClusterId
            aws_metric_name: EngineCPUUtilization
            aws_namespace: AWS/ElastiCache
            aws_statistics:
              - Average
            aws_tag_select:
              tag_selections:
                Environemnt: 
                  - ${cluster.name}
              resource_type_selection: 'elasticache:replication-group'
              resource_id_dimension: CacheClusterId    

The tag value needs to be a list, so that explains the error. But even with this configuration, I don't get any metrics.
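
For anyone hitting the same stack trace, here is a minimal sketch of why the scalar form fails. It assumes the exporter casts each parsed `tag_selections` value to a `java.util.Collection`, which is what the `ClassCastException` above suggests; the class and method names (`TagSelectionShape`, `castToCollection`, `castSucceeds`) are hypothetical and are not the exporter's actual code.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not taken from cloudwatch_exporter.
class TagSelectionShape {

    // Mimics an unchecked cast of a parsed YAML value to a collection.
    // The JVM still checks that the object is a Collection at runtime.
    @SuppressWarnings("unchecked")
    static Collection<String> castToCollection(Object yamlValue) {
        return (Collection<String>) yamlValue;
    }

    // Returns true if the cast works, false if it throws ClassCastException.
    static boolean castSucceeds(Object yamlValue) {
        try {
            castToCollection(yamlValue);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Scalar form, i.e. "Environment: myenvironmentname" parses to a String.
        Map<String, Object> scalarForm = new LinkedHashMap<>();
        scalarForm.put("Environment", "myenvironmentname");

        // List form, i.e. "Environment: [myenvironmentname]" parses to a List.
        Map<String, Object> listForm = new LinkedHashMap<>();
        listForm.put("Environment", List.of("myenvironmentname"));

        System.out.println("scalar value casts cleanly: "
                + castSucceeds(scalarForm.get("Environment")));
        System.out.println("list value casts cleanly:   "
                + castSucceeds(listForm.get("Environment")));
    }
}
```

Under that assumption, the scalar value fails the cast and the list value passes, which matches the behaviour described above.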

matthiasr commented 1 month ago

I think you also have a typo there, or is your tag actually Environemnt?

jurgenweber commented 1 month ago

Argh, nice one. With that fixed, it still doesn't work. :\

jurgenweber commented 1 month ago

Ok, looking into this more, it seems the 'resource_type_selection' is 'replicationgroup', not 'replication-group'. I have a few of these running, and the former returns this metric:

aws_resource_info{job="aws_elasticache",instance="",arn="arn:aws:elasticache:us-east-1:512550242293:replicationgroup........ and it has all of the instances with the Environment tag.

But, I do not have the EngineCPUUtilization metric in the endpoint.
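
As a way to double-check the resource type string, here is a small sketch that derives it from an ARN like the one in `aws_resource_info`. It assumes `resource_type_selection` takes the `service:resourceType` form that appears in the ARN. The ARN in the example is made up (placeholder account ID and group name), and `ArnParts` with its methods is a hypothetical helper, not part of the exporter.

```java
// Illustrative only: hypothetical helper, not taken from cloudwatch_exporter.
class ArnParts {

    // Splits arn:partition:service:region:account:resource into six fields,
    // keeping any further colons inside the resource field.
    static String[] split(String arn) {
        String[] parts = arn.split(":", 6);
        if (parts.length < 6 || !parts[0].equals("arn")) {
            throw new IllegalArgumentException("not an ARN: " + arn);
        }
        return parts;
    }

    // Index of the first ':' or '/' in the resource field, or -1 if none.
    static int separatorIndex(String resource) {
        int colon = resource.indexOf(':');
        int slash = resource.indexOf('/');
        if (colon < 0) return slash;
        if (slash < 0) return colon;
        return Math.min(colon, slash);
    }

    // Builds "service:resourceType", e.g. "elasticache:replicationgroup".
    static String resourceTypeSelection(String arn) {
        String[] parts = split(arn);
        int sep = separatorIndex(parts[5]);
        String type = sep < 0 ? parts[5] : parts[5].substring(0, sep);
        return parts[2] + ":" + type;
    }

    // Returns the part of the resource field after the type, or "" if absent.
    static String resourceId(String arn) {
        String resource = split(arn)[5];
        int sep = separatorIndex(resource);
        return sep < 0 ? "" : resource.substring(sep + 1);
    }

    public static void main(String[] args) {
        // Made-up ARN for illustration; account ID and name are placeholders.
        String arn = "arn:aws:elasticache:us-east-1:123456789012:replicationgroup:my-redis";
        System.out.println(resourceTypeSelection(arn));
        System.out.println(resourceId(arn));
    }
}
```

It may also be worth comparing the resource ID extracted this way with the values the `CacheClusterId` dimension actually carries in CloudWatch, since the two are not guaranteed to be the same string.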

jurgenweber commented 1 month ago

So I tried changing the resource_id_dimension, without success. This is what I have right now:

          - aws_dimensions:
              - CacheNodeId
              - CacheClusterId
            aws_metric_name: EngineCPUUtilization
            aws_namespace: AWS/ElastiCache
            aws_statistics:
              - Average
            aws_tag_select:
              tag_selections:
                Environment:
                  - myenvironmentname
              resource_type_selection: elasticache:replicationgroup 
              resource_id_dimension: CacheNodeId

jurgenweber commented 1 month ago

I don't know if this is a bug or not, but I changed resource_id_dimension to 'CacheClusterId,CacheNodeId'.

The metric 'aws_resource_info' shows just the tagged instances, but the metric 'aws_elasticache_engine_cpuutilization_average' has series for instances that do not have the correct Environment tag value. At least it is finally there now.