Azure / iotedge

The IoT Edge OSS project
MIT License

[MetricsCollector] - BlockedMetrics option in azureiotedge-metrics-collector:1.0.2 not working #6133

Closed adithyaamara closed 2 years ago

adithyaamara commented 2 years ago

Expected Behavior

As per the docs on the Azure metrics collector, if one passes the environment variable BlockedMetrics=request_per_second_created{}[http://anyScrapingTarget:9000] to the container, it should stop scraping the specified metric from the specified target.

Current Behavior

The module scrapes the same number of metrics before and after the addition of the above env variable (198 in my case). The metrics collector log says: Scraping finished, received 198 metrics from endpoint http://anyScrapingTarget:9000

Device Information

Runtime Versions

Additional Information

I initially suspected that I was using the wrong syntax for the BlockedMetrics variable. Here are the various combinations of BlockedMetrics I have tried:

Thank you.

nlcamp commented 2 years ago

@adithyaamara - We verified that BlockedMetrics are working in v1.0.2 of the MetricsCollector. From your description it seems like you're expecting the number of scraped metrics to decrease, but that number does not change. The number of metrics uploaded is what changes.

For example, if you set BlockedMetrics = edgeAgent_cpu_percent, you might see the following message about the number of metrics scraped:

[2022-02-18 20:10:33.251 INF] Scraping endpoint http://edgeAgent:9600/metrics
[2022-02-18 20:10:33.372 INF] Scraping endpoint http://edgeHub:9600/metrics
[2022-02-18 20:10:33.558 INF] Scraping finished, received 25 metrics from endpoint http://edgeHub:9600/metrics
[2022-02-18 20:10:33.562 INF] Scraping finished, received 97 metrics from endpoint http://edgeAgent:9600/metrics

That shows 122 metrics scraped. However, lower down in the log you'll see the following, which shows that only 106 of the 122 metrics are actually sent to Azure Monitor:

[2022-02-18 20:10:38.768 INF] Successfully sent 106 metrics to fixed set table
[2022-02-18 20:10:38.769 INF] Successfully completed periodic operation Scrape and Upload Metrics
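
The difference between the two counts can be pictured with a small sketch. This is not the collector's actual code: the `Sample` type, the `filter_blocked` helper, and the sample data are hypothetical, and the only assumption is that blocking is applied after scraping and before upload, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One scraped metric sample (hypothetical shape, for illustration only)."""
    name: str
    value: float


def filter_blocked(scraped, blocked_names):
    """Return only the samples whose name is not in the blocked set."""
    return [s for s in scraped if s.name not in blocked_names]


# Made-up scrape result: the scraper still receives every sample.
scraped = [
    Sample("edgeAgent_cpu_percent", 12.5),
    Sample("edgeAgent_cpu_percent", 3.0),
    Sample("example_other_metric", 1024.0),  # hypothetical metric name
]

# Blocking happens only on the upload side in this sketch.
uploaded = filter_blocked(scraped, {"edgeAgent_cpu_percent"})

print(f"received {len(scraped)} metrics")  # scrape count is unaffected by blocking
print(f"sent {len(uploaded)} metrics")     # upload count is what shrinks
```

In this picture the "received N metrics" line reflects `len(scraped)`, which is why it stays the same when BlockedMetrics is set.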

If we're misunderstanding your setup, please let us know. Sharing your metrics collector logs (with any sensitive information redacted) would help us further investigate.

adithyaamara commented 2 years ago

@nlcamp , Thank you for your response.

  1. We don't see any log line like: Successfully sent 106 metrics to fixed set table. [Please find logs below]
  2. Suppose the metric request_per_second_created is to be excluded on target11. To rule out the possibility that I am using the wrong syntax for BlockedMetrics, could you please let me know the value of BlockedMetrics to use?

    Note: We are using a nested edge setup and I am trying to apply this BlockedMetrics setting in the metrics collector on a child device. Do I need to consider anything extra in this kind of Edge setup?

    • Here are the logs of metrics collector for one periodic run: (Actual service names are represented as targets)
      [2022-02-19 08:10:37.405 INF] Starting periodic operation Scrape and Upload Metrics...
      [2022-02-19 08:10:37.405 INF] Scraping endpoint http://target1:9600/metrics
      [2022-02-19 08:10:37.405 INF] Scraping endpoint http://target2:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target3:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target4:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target5:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target6:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target7:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target8:9600/metrics
      [2022-02-19 08:10:37.406 INF] Scraping endpoint http://target9:9600
      [2022-02-19 08:10:37.407 INF] Scraping endpoint http://target10:9600/metrics
      [2022-02-19 08:10:37.407 INF] Scraping endpoint http://target11:9000
      [2022-02-19 08:10:37.451 INF] Scraping finished, received 11 metrics from endpoint http://target2:9600/metrics
      [2022-02-19 08:10:37.456 INF] Scraping finished, received 14 metrics from endpoint http://target5:9600/metrics
      [2022-02-19 08:10:37.456 INF] Scraping finished, received 18 metrics from endpoint http://target6:9600/metrics
      [2022-02-19 08:10:37.458 INF] Scraping finished, received 17 metrics from endpoint http://target9:9600
      [2022-02-19 08:10:37.458 INF] Scraping finished, received 17 metrics from endpoint http://target4:9600/metrics
      [2022-02-19 08:10:37.459 INF] Scraping finished, received 15 metrics from endpoint http://target3:9600/metrics
      [2022-02-19 08:10:37.462 INF] Scraping finished, received 14 metrics from endpoint http://target8:9600/metrics
      [2022-02-19 08:10:37.463 INF] Scraping finished, received 27 metrics from endpoint http://target7:9600/metrics
      [2022-02-19 08:10:37.468 INF] Scraping finished, received 248 metrics from endpoint http://target1:9600/metrics
      [2022-02-19 08:10:37.505 INF] Scraping finished, received 198 metrics from endpoint http://target11:9000
      [2022-02-19 08:10:37.506 INF] Scraping finished, received 22 metrics from endpoint http://target10:9600/metrics
      [2022-02-19 08:10:37.541 INF] Successfully sent metrics via IoT message
      [2022-02-19 08:10:37.541 INF] Successfully completed periodic operation Scrape and Upload Metrics

      Thank you.

nlcamp commented 2 years ago

@adithyaamara Ok, I see you have the UploadTarget env var set to 'IoTMessage'. In that mode the metrics collector unfortunately does not log the number of metrics being uploaded.

For your use case, just set BlockedMetrics to 'request_per_second_created'. That will block all metrics with that name, regardless of tag values.

Have you configured a cloud workflow to deliver the collected metrics to Log Analytics? If so, I'd suggest doing a query in your Log Analytics workspace to confirm that the blocked metrics are not being uploaded. The following sample has instructions for how to deliver metrics to Log Analytics when using the 'IoTMessage' UploadTarget setting: https://github.com/Azure-Samples/iotedge-logging-and-monitoring-solution#monitoring-architecture-reference

adithyaamara commented 2 years ago

@nlcamp . Thank you very much.

As per your suggestion, we queried Log Analytics and confirmed that the metrics declared in BlockedMetrics are not being uploaded.

We also tested multiple blocked metrics using BlockedMetrics=request_per_second_created,successful_requests_per_second_created, which worked as expected.
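
For illustration only, a comma-separated value like this can be thought of as a set of metric names to drop. The sketch below is an assumption about the behavior, not the collector's parser: `parse_blocked_metrics` is a hypothetical helper, it handles bare metric names only (no label or endpoint selectors, which may themselves contain commas), and the `*_total` name is made up.

```python
def parse_blocked_metrics(value):
    """Split a comma-separated list of bare metric names into a set.

    Hypothetical helper: label selectors ({...}) and endpoint
    selectors ([...]) are deliberately not handled here.
    """
    return {name.strip() for name in value.split(",") if name.strip()}


blocked = parse_blocked_metrics(
    "request_per_second_created,successful_requests_per_second_created"
)

# Names as they might appear in a scrape; the *_total name is invented.
scraped_names = [
    "request_per_second_created",
    "successful_requests_per_second_created",
    "request_per_second_total",
]

kept = [name for name in scraped_names if name not in blocked]
print(kept)
```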

Since it is tedious to list every _created metric explicitly like this, I wanted to know whether it is possible to match the metric names themselves with a regex (not the labels inside a single metric).

Example: BlockedMetrics=*_created should block any metric name that ends with _created.

Thanks for the support.

nlcamp commented 2 years ago

@adithyaamara - Great! Glad it's working for you. Regarding your question about using a regex to match metric names, there is partial support for that. Your particular example should work. The following snippet is from our public doc titled Collect and transport metrics:

Wildcards * (any characters) and ? (any single character) can be used in metric names. For example, *CPU would match maxCPU and minCPU but not CPUMaximum. ???CPU would match maxCPU and minCPU but not maximumCPU. This component is required in a metrics selector.
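
The wildcard rules quoted above can be tried out with a short sketch. It uses Python's standard `fnmatch.fnmatchcase`, whose `*` and `?` have the same meaning as in the quoted doc, as a stand-in. It is not the collector's matcher, and `fnmatch` also accepts `[seq]` patterns that the collector may not support. The `matches` helper and the `request_per_second_total` name are hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern, metric_name):
    """Case-sensitive wildcard match: * is any run of characters, ? is exactly one."""
    return fnmatchcase(metric_name, pattern)


# Examples taken from the quoted doc text.
print(matches("*CPU", "maxCPU"))        # True
print(matches("*CPU", "CPUMaximum"))    # False: the name must end with CPU
print(matches("???CPU", "minCPU"))      # True
print(matches("???CPU", "maximumCPU"))  # False: ??? is exactly three characters

# The pattern asked about in this thread.
print(matches("*_created", "request_per_second_created"))  # True
print(matches("*_created", "request_per_second_total"))    # False
```

Under these rules, a single `*_created` pattern covers every metric name that ends with `_created`, so the names do not need to be listed one by one.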