Technofy / cloudwatch_exporter

A CloudWatch exporter for Prometheus coded in Go, with multi-region support
Apache License 2.0
78 stars, 140 forks

Dimension labels are mixed up #10

Open tul opened 6 years ago

tul commented 6 years ago

If a config like the following is used:

  metrics:
      - aws_namespace: 'AWS/ECS'
        aws_metric_name: 'MemoryUtilization'
        aws_dimensions: [ 'ClusterName', 'ServiceName' ]
        aws_statistics: [ 'Maximum' ]
        aws_dimensions_select_regex:
            ServiceName: 'bot-.*'
            ClusterName: 'us-east-1-prod-public'

Then it produces metrics with the "service_name" and "cluster_name" values swapped, e.g.:

aws_ecs_memory_utilization_maximum{cluster_name="bot-blah",service_name="us-east-1-prod-public",task="bots-dev"} 8.69140625

If the aws_dimensions entries are swapped to [ 'ServiceName', 'ClusterName' ], the dimensions are labelled correctly:

aws_ecs_memory_utilization_maximum{cluster_name="us-east-1-prod-public",service_name="bot-blah",task="bots-dev"} 8.69140625
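One plausible explanation, which is an assumption and not confirmed anywhere in this thread, is that label names are paired with dimension values by their position in aws_dimensions, while the API returns the dimensions in its own order. The Python sketch below contrasts positional pairing with pairing by dimension name. The function names and the `returned` list are hypothetical and are not taken from the exporter's code.

```python
import re


def to_label(dimension_name):
    """Convert a CamelCase dimension name such as 'ClusterName' to 'cluster_name'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", dimension_name).lower()


def labels_by_position(configured, returned):
    """Pair the i-th configured name with the i-th returned value (order-sensitive)."""
    return {to_label(name): dim["Value"] for name, dim in zip(configured, returned)}


def labels_by_name(returned):
    """Use the name attached to each returned dimension (order-independent)."""
    return {to_label(dim["Name"]): dim["Value"] for dim in returned}


# Hypothetical API result in which ServiceName is listed before ClusterName.
returned = [
    {"Name": "ServiceName", "Value": "bot-blah"},
    {"Name": "ClusterName", "Value": "us-east-1-prod-public"},
]

# Positional pairing reproduces the swap seen in the first metric above.
swapped = labels_by_position(["ClusterName", "ServiceName"], returned)
# Pairing by name gives the same labels whatever the config order is.
correct = labels_by_name(returned)
print(swapped)
print(correct)
```

If this assumption holds, it would also explain why reversing the aws_dimensions list happens to produce the right labels: the configured order then matches the returned order by coincidence.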

tokyowizard commented 6 years ago

To add to this: for some metrics the values are assigned to the wrong labels, while a different metric in the same namespace (AWS/ApplicationELB) returns the correct values with the correct labels.

Wrong values with labels:

With the following config:

      - aws_namespace: "AWS/ApplicationELB"
        aws_metric_name: HTTPCode_Target_2XX_Count
        aws_statistics: ["Sum"]
        aws_dimensions: [TargetGroup, LoadBalancer]
        aws_dimensions_select_regex:
          TargetGroup: .*
          LoadBalancer: app/production-.*

The metric returned:

aws_application_elb_http_code_target_2xx_count_sum{
  asg="REDACTED",
  exported_task="alb",
  hostname="REDACTED",
  instance="REDACTED:9042",
  instance_type="REDACTED",
  load_balancer="ap-northeast-1a",
  region="ap-northeast-1",
  role="REDACTED",
  target_group="app/production-REDACTED",
  task="alb"
}

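In the metric above, the target_group label incorrectly carries a value that matches the LoadBalancer pattern (app/production-REDACTED), and load_balancer carries "ap-northeast-1a", which is not a load balancer name.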

Correct values with labels:

The config below is the same as the one above, except that aws_metric_name and aws_statistics differ.

      - aws_namespace: "AWS/ApplicationELB"
        aws_metric_name: HealthyHostCount
        aws_statistics: ["Average"]
        aws_dimensions: [TargetGroup, LoadBalancer]
        aws_dimensions_select_regex:
          TargetGroup: .*
          LoadBalancer: app/production-.*

Correct values with labels are returned:

aws_application_elb_healthy_host_count_average{
  asg="REDACTED",
  exported_task="alb",
  hostname="REDACTED",
  instance="REDACTED:9042",
  instance_type="REDACTED",
  load_balancer="app/production-REDACTED",
  region="ap-northeast-1",
  role="REDACTED",
  target_group="targetgroup/REDACTED",
  task="alb"
}

Both HTTPCode_Target_2XX_Count and HealthyHostCount metrics are in the same namespace. https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/elb-metricscollected.html#load-balancer-metrics-alb
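A quick way to spot the mismatch in scraped output is to check each label value against the regex configured for its dimension. The following Python sketch does this for the two metrics above. The helper names are hypothetical, and the use of a full match (rather than a partial search) is an assumption about how aws_dimensions_select_regex is meant to be applied.

```python
import re


def to_label(dimension_name):
    """Convert a CamelCase dimension name such as 'LoadBalancer' to 'load_balancer'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", dimension_name).lower()


def mismatched_labels(select_regex, labels):
    """Return the label names whose value is missing or does not match its regex."""
    bad = []
    for dimension, pattern in select_regex.items():
        label = to_label(dimension)
        value = labels.get(label)
        # Assumption: the configured pattern must match the whole value.
        if value is None or re.fullmatch(pattern, value) is None:
            bad.append(label)
    return bad


# Patterns copied from the aws_dimensions_select_regex config above.
select_regex = {"TargetGroup": ".*", "LoadBalancer": "app/production-.*"}

# Label values copied from the two metrics above.
wrong = {"load_balancer": "ap-northeast-1a", "target_group": "app/production-REDACTED"}
right = {"load_balancer": "app/production-REDACTED", "target_group": "targetgroup/REDACTED"}

print(mismatched_labels(select_regex, wrong))
print(mismatched_labels(select_regex, right))
```

Because the TargetGroup pattern is `.*`, only the load_balancer label can be flagged here; a stricter TargetGroup pattern would be needed to catch a misplaced target_group value as well.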

ghost commented 6 years ago

@tul @tokyowizard I pushed a new branch named fix_label_swaping, which contains a fix for the issue. Could you pull it, build it, and report whether it solves the issue for you as well? Best regards!

tokyowizard commented 6 years ago

@n17h31sm I had to get monitoring set up and move on to other projects. To do so, I used the "official" exporter even though it is Java-based. Sorry, I won't be able to test because of that.

tul commented 5 years ago

@n17h31sm Sorry for the long delay. Yes, the fix_label_swaping branch fixes the issue.

alexteldekov commented 4 years ago

I tried to run a docker image using the fix_label_swaping branch, but got lots of

MalformedInput: timestamp must follow ISO8601
    status code: 400, request id: 68710424-5fdd-4493-a1ba-5df97bfa6973

messages and no metrics at http://cloudwatch-exporter:9042/scrape. The technofy/cloudwatch_exporter:latest image works, but it sometimes swaps metrics... Is there a way to fix the "timestamp must follow ISO8601" error?