Open mtarantino opened 3 years ago
Thanks for reporting. Is it possible to add a screenshot of the metrics explorer in the Azure portal for these metrics, please?
@tomkerkhove, attached a screenshot of the metrics explorer on Azure portal:
Another example with the following metricDefaults:

```yaml
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * ? * *"
```
Logs:
[13:28:57 INF] Scraping Azure Monitor - 03/17/2021 13:28:57 +00:00
[13:28:57 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:28:57 INF] Adding Prometheus sink to expose on /metrics
[13:28:57 DBG] Failed to locate the development https certificate at 'null'.
[13:28:57 INF] Now listening on: http://[::]:8080
[13:28:57 INF] Application started. Press Ctrl+C to shut down.
[13:28:57 INF] Hosting environment: Production
[13:28:57 INF] Content root path: /app
[13:29:00 INF] Found value 4 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:29:00 INF] Scraping Azure Monitor - 03/17/2021 13:29:00 +00:00
[13:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:29:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:30:00 INF] Scraping Azure Monitor - 03/17/2021 13:30:00 +00:00
[13:30:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:30:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:31:00 INF] Scraping Azure Monitor - 03/17/2021 13:31:00 +00:00
[13:31:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:31:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:32:00 INF] Scraping Azure Monitor - 03/17/2021 13:32:00 +00:00
[13:32:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:32:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:33:00 INF] Scraping Azure Monitor - 03/17/2021 13:33:00 +00:00
[13:33:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:33:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:34:00 INF] Scraping Azure Monitor - 03/17/2021 13:34:00 +00:00
[13:34:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:34:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:35:00 INF] Scraping Azure Monitor - 03/17/2021 13:35:00 +00:00
[13:35:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:35:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
As you can see:
[13:28:57 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[13:29:00 INF] Found value 4 for metric azure_service_bus_messages with aggregation interval 00:05:00
[13:29:00 INF] Scraping Azure Monitor - 03/17/2021 13:29:00 +00:00
[13:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[13:29:01 INF] Found value 3.2 for metric azure_service_bus_messages with aggregation interval 00:05:00
The first scrape returns the correct value, 4, and every scrape after that returns 3.2.
I thought it might be because the two API calls are too close together, but the result is the same when I use a fixed time in the schedule, so that more than 3 minutes pass between the first scrape (which I guess is part of some initialisation) and the next one.
Hm odd, is the metric explorer also using 00:05:00 as aggregation?
Frankly, we only scrape the API with the given configuration; there is no data manipulation on our side.
> Hm odd, is the metric explorer also using 00:05:00 as aggregation?
This is the metrics configuration:

```yaml
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * ? * *"
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      aggregation:
        type: Average
    resources:
      - namespace: myNamespace
        topicName: myTopic
```
I would need to dig into the API call. I saw in runtime/scraper#azure-monitor that it's possible to log the Azure Monitor API call, but I didn't manage to make it work with my configuration. Is there something wrong with the following config:
```yaml
server:
  httpPort: 8080 # Optional. Default: 80
metricSinks:
  prometheusScrapingEndpoint:
    metricUnavailableValue: NaN # Optional. Default: NaN
    enableMetricTimestamps: false # Optional. Default: true
    baseUriPath: /metrics # Optional. Default: /metrics
metricsConfiguration:
  absolutePath: /config/metrics-declaration.yaml # Optional. Default: /config/metrics-declaration.yaml
telemetry:
  containerLogs:
    isEnabled: true # Optional. Default: true
    verbosity: trace # Optional. Default: N/A
  defaultVerbosity: Trace # Optional. Default: error
azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false
```
I was more referring to the aggregation in the metric explorer of the Azure Portal, can you check please?
> I was more referring to the aggregation in the metric explorer of the Azure Portal, can you check please?
@tomkerkhove : Ah, yes the aggregation in the Azure Portal Metric explorer (called "time granularity" on the interface) is set to 5 minutes.
I've done some additional tests and let Promitor run overnight with this metrics configuration:

```yaml
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: "0 * * * * *"
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      aggregation:
        type: Total
    resources:
      - namespace: myNamespace
        topicName: myTopic
```
This is the Metric Explorer screenshot:
And these are the logs. We can see that the total oscillates between 20 and 25, even though the number of messages is fixed at 5 (the correct total should be 25):
[07:20:00 INF] Scraping Azure Monitor - 03/18/2021 07:20:00 +00:00
[07:20:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:20:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:21:00 INF] Scraping Azure Monitor - 03/18/2021 07:21:00 +00:00
[07:21:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:21:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:22:00 INF] Scraping Azure Monitor - 03/18/2021 07:22:00 +00:00
[07:22:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:22:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:23:00 INF] Scraping Azure Monitor - 03/18/2021 07:23:00 +00:00
[07:23:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:23:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:24:00 INF] Scraping Azure Monitor - 03/18/2021 07:24:00 +00:00
[07:24:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:24:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:25:00 INF] Scraping Azure Monitor - 03/18/2021 07:25:00 +00:00
[07:25:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:25:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:26:00 INF] Scraping Azure Monitor - 03/18/2021 07:26:00 +00:00
[07:26:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:26:00 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:27:00 INF] Scraping Azure Monitor - 03/18/2021 07:27:00 +00:00
[07:27:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:27:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:28:00 INF] Scraping Azure Monitor - 03/18/2021 07:28:00 +00:00
[07:28:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:28:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:29:00 INF] Scraping Azure Monitor - 03/18/2021 07:29:00 +00:00
[07:29:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:29:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:30:00 INF] Scraping Azure Monitor - 03/18/2021 07:30:00 +00:00
[07:30:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:30:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:31:00 INF] Scraping Azure Monitor - 03/18/2021 07:31:00 +00:00
[07:31:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:31:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:32:00 INF] Scraping Azure Monitor - 03/18/2021 07:32:00 +00:00
[07:32:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:32:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:33:00 INF] Scraping Azure Monitor - 03/18/2021 07:33:00 +00:00
[07:33:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:33:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:34:00 INF] Scraping Azure Monitor - 03/18/2021 07:34:00 +00:00
[07:34:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:34:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:35:00 INF] Scraping Azure Monitor - 03/18/2021 07:35:00 +00:00
[07:35:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:35:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:36:00 INF] Scraping Azure Monitor - 03/18/2021 07:36:00 +00:00
[07:36:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:36:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:37:00 INF] Scraping Azure Monitor - 03/18/2021 07:37:00 +00:00
[07:37:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:37:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:38:00 INF] Scraping Azure Monitor - 03/18/2021 07:38:00 +00:00
[07:38:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:38:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:39:00 INF] Scraping Azure Monitor - 03/18/2021 07:39:00 +00:00
[07:39:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:39:00 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:40:00 INF] Scraping Azure Monitor - 03/18/2021 07:40:00 +00:00
[07:40:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:40:00 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:41:00 INF] Scraping Azure Monitor - 03/18/2021 07:41:00 +00:00
[07:41:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:41:01 INF] Found value 25 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:42:00 INF] Scraping Azure Monitor - 03/18/2021 07:42:00 +00:00
[07:42:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:42:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
[07:43:00 INF] Scraping Azure Monitor - 03/18/2021 07:43:00 +00:00
[07:43:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[07:43:01 INF] Found value 20 for metric azure_service_bus_messages with aggregation interval 00:05:00
That is true, but the metrics explorer is using average, while you are scraping totals?
Next to that, can you compare it without filter on the subscription? Maybe that's causing an issue that I need to fix.
> That is true, but the metrics explorer is using average, while you are scraping totals?
@tomkerkhove: The issue is the same with Average, but I wanted to try different aggregation types (in the explorer only Average is available, whereas Promitor can retrieve Maximum, Total and Minimum).
As I mentioned earlier, the results are also wrong with the other aggregation types, e.g. with 4 messages in the queue:
> Next to that, can you compare it without filter on the subscription? Maybe that's causing an issue that I need to fix.
I guess you mean something like removing the filter on the topic/queue, like:

```yaml
metrics:
  - name: azure_service_bus_messages
    description: "Count of messages in a Queue/Topic."
    resourceType: ServiceBusNamespace
    labels:
      env: dev
    azureMetricConfiguration:
      metricName: Messages
      dimension:
        name: EntityName
      aggregation:
        type: Average
    resources:
      - namespace: myNamespace
```
The issue is the same (I have 5 messages in the topic):
[08:04:35 INF] Scraping Azure Monitor - 03/18/2021 08:04:35 +00:00
[08:04:35 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[08:04:35 INF] Adding Prometheus sink to expose on /metrics
[08:04:35 DBG] Failed to locate the development https certificate at 'null'.
[08:04:35 INF] Now listening on: http://[::]:8080
[08:04:35 INF] Application started. Press Ctrl+C to shut down.
[08:04:35 INF] Hosting environment: Production
[08:04:35 INF] Content root path: /app
...
[08:04:39 INF] Found value 5 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:05:00 INF] Scraping Azure Monitor - 03/18/2021 08:05:00 +00:00
[08:05:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:05:03 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:06:00 INF] Scraping Azure Monitor - 03/18/2021 08:06:00 +00:00
[08:06:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:06:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:07:00 INF] Scraping Azure Monitor - 03/18/2021 08:07:00 +00:00
[08:07:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:07:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:08:00 INF] Scraping Azure Monitor - 03/18/2021 08:08:00 +00:00
[08:08:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:08:02 INF] Found value 4 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
[08:09:00 INF] Scraping Azure Monitor - 03/18/2021 08:09:00 +00:00
[08:09:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
...
[08:09:01 INF] Found value 5 for metric azure_service_bus_messages with dimension myTopic as part of entity_name dimension with aggregation interval 00:05:00
It would be interesting to see the API calls to Azure Monitor. This should be the configuration that prints the calls to stdout, right:
```yaml
server:
  httpPort: 8080 # Optional. Default: 80
metricSinks:
  prometheusScrapingEndpoint:
    metricUnavailableValue: NaN # Optional. Default: NaN
    enableMetricTimestamps: false # Optional. Default: true
    baseUriPath: /metrics # Optional. Default: /metrics
metricsConfiguration:
  absolutePath: /config/metrics-declaration.yaml # Optional. Default: /config/metrics-declaration.yaml
telemetry:
  containerLogs:
    isEnabled: true # Optional. Default: true
    verbosity: trace # Optional. Default: N/A
  defaultVerbosity: Trace # Optional. Default: error
azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false
```
I also notice that, for the other queues and topics in my namespace, all values randomly oscillate at the same time between 2 states:

- state1: wrong value (if we assume the queue/topic sizes are constant, it looks like value = avg - avg/interval)
- state2: correct value

I didn't find any pattern in the oscillation; it looks random:

state1, state2, state1, state1, state1, state2, state1, state2, state1, state1
I'll have to take a look, sorry!
Just FYI, you should only rely on Average as that's the only supported aggregation: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported#microsoftservicebusnamespaces
> Just FYI, you should only rely on Average as that's the only supported aggregation: https://docs.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported#microsoftservicebusnamespaces
Sure, this is what I initially did, but as the values were not what I expected, I tested the other aggregation types to understand what could be wrong.
> I'll have to take a look, sorry!
Thanks :)
I did a bit more investigation at the Azure Monitor API level and ran some tests using https://docs.microsoft.com/en-us/rest/api/monitor/metrics/list#code-try-0 (I'm assuming the aggregation interval you're using translates into a timespan in the format below). These are the parameters I used:
```
resourceUri: /subscriptions/XXXX/resourceGroups/XXXX/providers/Microsoft.ServiceBus/namespaces/myNamespace/
api-version: 2018-01-01
aggregation: average
metricnames: Messages
timespan: 2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z
$filter: EntityName eq 'scc-lt-deadletter'
```
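For reference, those parameters correspond to a management-plane request that can be sketched as follows. This is my own illustration, not Promitor's code: the URL is only constructed, never sent, and the subscription path is the placeholder one from above.

```python
from urllib.parse import urlencode

# Placeholder resource URI from the parameters above; the trailing slash is
# stripped so the path segments join cleanly.
resource_uri = ("/subscriptions/XXXX/resourceGroups/XXXX/providers/"
                "Microsoft.ServiceBus/namespaces/myNamespace/").rstrip("/")

params = {
    "api-version": "2018-01-01",
    "metricnames": "Messages",
    "aggregation": "average",
    "timespan": "2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z",
    "$filter": "EntityName eq 'scc-lt-deadletter'",
}

# The Metrics - List endpoint lives under the Microsoft.Insights provider.
url = ("https://management.azure.com" + resource_uri
       + "/providers/Microsoft.Insights/metrics?" + urlencode(params))
print(url)
```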
With these parameters I received 5 results and all values were correct. If I change the timespan to 2021-03-17T11:05:00.000Z/2021-03-17T11:09:59.000Z, I only receive 4 results.
Thus I was wondering if the issue could be related to AzureMonitorClient#GetClosestAggregationInterval(), which might sometimes compute a slightly-too-short aggregation window, so that instead of 5 results there are only 4, or something like that.
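If so, the off-by-one is easy to model. The sketch below encodes my assumption that Azure Monitor only returns a data point for each 1-minute bucket that fits entirely inside the requested timespan; it is an illustration, not Promitor's actual code:

```python
from datetime import datetime, timedelta

def complete_buckets(start: datetime, end: datetime,
                     granularity: timedelta = timedelta(minutes=1)) -> list:
    """Start times of all granularity-sized buckets fully inside [start, end]
    -- a hypothetical model of which data points a query returns."""
    buckets = []
    t = start
    while t + granularity <= end:
        buckets.append(t)
        t += granularity
    return buckets

full = complete_buckets(datetime(2021, 3, 17, 11, 5), datetime(2021, 3, 17, 11, 10))
short = complete_buckets(datetime(2021, 3, 17, 11, 5), datetime(2021, 3, 17, 11, 9, 59))
print(len(full), len(short))  # → 5 4
```

Ending the timespan at 11:09:59 instead of 11:10:00 silently drops the last bucket, which matches the 5-vs-4-results observation above.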
I also performed a test using the following configuration:
```yaml
metricDefaults:
  aggregation:
    interval: 00:01:00
  scraping:
    schedule: "0 * * * * *"
```
The results are also showing unexpected metrics:
[16:12:45 INF] Content root path: /app
[16:12:49 INF] Found value 8 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:13:00 INF] Scraping Azure Monitor - 03/18/2021 16:13:00 +00:00
[16:13:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:13:01 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:14:00 INF] Scraping Azure Monitor - 03/18/2021 16:14:00 +00:00
[16:14:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:14:01 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:15:00 INF] Scraping Azure Monitor - 03/18/2021 16:15:00 +00:00
[16:15:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:15:02 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:16:00 INF] Scraping Azure Monitor - 03/18/2021 16:16:00 +00:00
[16:16:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:16:03 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
[16:17:00 INF] Scraping Azure Monitor - 03/18/2021 16:17:00 +00:00
[16:17:00 INF] Scraping azure_service_bus_messages for resource type ServiceBusNamespace
[16:17:02 INF] Found value 0 for metric azure_service_bus_messages with aggregation interval 00:01:00
Thanks for digging into this!
> Thus I was wondering if the issue could be related to the AzureMonitorClient#GetClosestAggregationInterval() which is sometimes providing a wrong closest aggregation time and thus instead of having 5 results there are only 4 or something like that.
Can you talk a bit more about that please?
You're suggesting that the issue might be in https://github.com/tomkerkhove/promitor/blob/master/src/Promitor.Integrations.AzureMonitor/AzureMonitorClient.cs#L122?
I notice that, for an aggregation interval of 5 min with a scraping interval of 1 min, a queue with a constant number of messages (let's take 4 in our case) shows the following behaviour:

- average = avg - avg/interval (e.g. 4 - 4/5 = 3.2)
- total = total - total/interval (e.g. 20 - 20/5 = 16)
- minimum = 0
- maximum = 4
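Those observed values are exactly what you get if the 5-minute window holds four real samples plus one empty bucket counted as 0; a minimal sketch of that assumption:

```python
# Assumed samples: the 1-minute buckets inside one 5-minute window, with the
# last bucket missing and treated as 0 instead of the real value 4.
samples = [4, 4, 4, 4, 0]

average = sum(samples) / len(samples)  # 3.2 instead of 4
total = sum(samples)                   # 16 instead of 20
minimum = min(samples)                 # 0 instead of 4
maximum = max(samples)                 # 4 (unaffected, matching the report)

print(average, total, minimum, maximum)  # → 3.2 16 0 4
```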
When querying a metric with an aggregation interval of 5 min, I expect Promitor to send, for example, the timespan 2021-03-17T11:05:00.000Z/2021-03-17T11:10:00.000Z and receive the following data:
```json
{
  "id": "/subscriptions/XXXX/resourceGroups/xxx/providers/Microsoft.ServiceBus/namespaces/xxx/providers/Microsoft.Insights/metrics/Messages",
  "type": "Microsoft.Insights/metrics",
  "name": {
    "value": "Messages",
    "localizedValue": "Count of messages in a Queue/Topic."
  },
  "displayDescription": "Count of messages in a Queue/Topic.",
  "unit": "Count",
  "timeseries": [
    {
      "metadatavalues": [
        {
          "name": {
            "value": "entityname",
            "localizedValue": "entityname"
          },
          "value": "myTopic"
        }
      ],
      "data": [
        { "timeStamp": "2021-03-17T11:05:00Z", "average": 4 },
        { "timeStamp": "2021-03-17T11:06:00Z", "average": 4 },
        { "timeStamp": "2021-03-17T11:07:00Z", "average": 4 },
        { "timeStamp": "2021-03-17T11:08:00Z", "average": 4 },
        { "timeStamp": "2021-03-17T11:09:00Z", "average": 4 }
      ]
    }
  ],
  "errorCode": "Success"
}
```
To summarise: an array T1 = [4, 4, 4, 4, 4], which is then aggregated according to the aggregation type (e.g. sum(T1), avg(T1), min(T1), max(T1)).
My assumption is that Promitor is somehow sending the timespan 2021-03-17T11:05:00.000Z/2021-03-17T11:09:59.000Z (or some timespan < 5 min), and because the interval is < 5 min, Azure Monitor only returns 4 values. So the array maybe ends up as something like T2 = [4, 4, 4, 4, null]; perhaps the null is then replaced by 0 (I don't know exactly), and all the aggregations are computed from this array.
That's why, to confirm this behaviour, it would be good to see the API call in the console using the configuration below, but that isn't working properly :( and I'm not a .NET developer, so I can't easily run the project on my machine and debug it.
```yaml
azureMonitor:
  logging:
    informationLevel: BodyAndHeaders # Optional. Default: Basic
    isEnabled: true # Optional. Default: false
```
@tomkerkhove I've set up the project to debug the application in depth, and I think I found the issue. The problem lies in the last interval returned by the Azure Monitor call: Azure Monitor computes the last minute on a best-effort basis. As you can see in the diagram below, the last minute drops to 0 because Azure Monitor didn't have time to compute the last interval yet:
Looking for the issue on the internet, I found azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values, which states:

> Depending on the service, the latency of processing metrics can be within a couple minutes range.
> For charts showing a recent time range with a 1- or 5- minute granularity, a drop of the value over
> the last few minutes becomes more noticeable:

And this is by design.
As a solution, there are two options:

1. Query with recordDateTime minus 1 minute:

```csharp
private async Task<IMetric> GetRelevantMetric(string metricName, AggregationType metricAggregation, TimeSpan metricInterval,
    string metricFilter, string metricDimension, IMetricDefinition metricDefinition, DateTime recordDateTime)
{
    // Shift the query end time back one minute so the still-incomplete bucket is excluded
    recordDateTime = recordDateTime.AddMinutes(-1);
    var metricQuery = CreateMetricsQuery(metricAggregation, metricInterval, metricFilter, metricDimension, metricDefinition, recordDateTime);
    // ...
}
```
2. Adapt the scraping.schedule configuration so scraping starts as late as possible within the minute, for example:

```yaml
scraping:
  schedule: "55 * * * * *"
```

If the scrape starts at 11:00:55, for example, this gives Azure Monitor 55 seconds to compute the data between 10:59:00 and 11:00:00.
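A quick sketch of the headroom this second workaround buys (a hypothetical helper of mine, not part of Promitor):

```python
from datetime import datetime, timedelta

def headroom(scrape_time: datetime) -> timedelta:
    """How long Azure Monitor had to finalize the most recent complete
    1-minute bucket before this scrape ran."""
    last_bucket_end = scrape_time.replace(second=0, microsecond=0)
    return scrape_time - last_bucket_end

# A scrape scheduled at second 55 leaves 55 seconds of processing headroom
# for the 10:59:00-11:00:00 bucket.
print(headroom(datetime(2021, 3, 18, 11, 0, 55)))  # → 0:00:55
```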
Both solutions might not work for all metrics since, as azure-monitor/metrics-troubleshoot#chart-shows-unexpected-drop-in-values notes, the latency of processing metrics can be within a couple of minutes, and here we're only allowing less than 1 minute.
I'm still looking into this but have to finish #444 first.
In the meantime, I've checked: the Azure Monitor insights logging works, but it requires trace verbosity. I've verified it and the docs state this: https://promitor.io/configuration/v2.x/runtime/scraper#azure-monitor
This relates to https://github.com/tomkerkhove/promitor/issues/1290 as well. I would stick with an aggregation of 5 minutes due to Azure Monitor, but I'll think about this issue; we could make it configurable to ignore a time series (but when should we do that, and what if the value really is 0?).
Are you still seeing this?
**Report**

Using the latest Docker image, I'm scraping Service Bus metrics and counting messages in a Queue/Topic. The aggregation is not computed according to my expectation.

Details: I'm using a metricDefaults.aggregation.interval of 00:05:00 and a metricDefaults.scraping.schedule of 0 * * ? * *.

**Expected Behavior**

My queue has a constant size of 4 messages. I expected the aggregations to be:

- Average: 4
- Minimum: 4
- Total: 20

**Actual Behavior**

My queue has a constant size of 4 messages. I see the following aggregations:

- Average: 3.2
- Min: 0
- Total: 16

Even after running the scraper for 10 minutes, the results don't change. It's as if the aggregation was missing one value (e.g. [4-4-4-4-0] instead of [4-4-4-4-4]). When I'm using Maximum the value is correct.

**Steps to Reproduce the Problem**

1. Have 4 messages in an Azure Service Bus queue (without consuming them)

**Component**

Scraper

**Version**

2.1.1

**Configuration**

runtime.yaml
metrics-declaration.yaml

**Logs**

Unfortunately, despite enabling azureMonitor in my runtime configuration, I was not able to get the Azure Monitor API call logged (not sure why...).
For Average:
For Total:

**Platform**

Other

**Contact Details**

tarantino.dev@gmail.com