Provide labels for Azure Tags per metric

tomkerkhove commented 5 years ago

We should determine if it's feasible and makes sense to provide labels for Azure Tags per metric.

Original suggestion was made in #462 by @brusMX .

Checklist

[ ] Provide integration
[ ] Provide capability to opt-in
[ ] Provide documentation on the feature
[ ] Extend metric labeling documentation (see #605)

jcorioland commented 5 years ago

Having a look at the code I think this is something that might be doable easily in the QueryMetricAsync method of AzureMonitorClient class. Like that:

// get tags
string resourceGroupName = "EXTRACT NAME FROM resourceId";
var resourceGroup = await _authenticatedAzureSubscription.ResourceGroups.GetByNameAsync(resourceGroupName);
 var tags = resourceGroup.Tags;

I am just wondering how we can add Tags property on the metric as you seem to use the object from the monitoring SDK, right?

Any thoughts on that @tomkerkhove ?

tomkerkhove commented 5 years ago

I've had a look and I can access the data very easily, however, you need to know what property to call so ever scraper has to do it.

Will be a good addition, but it will be opt-in and only for post v1.0.

Does that sound fair?

tomkerkhove commented 5 years ago

From a Promitor perspective, I think we should only get tags from the resource itself rather than the resource group.

That way we keep the required permissions scoped to what is essential and we get aimed tags.

jcorioland commented 5 years ago

Yep makes sense. And I checked yesterday, seems like the Monitoring Reader role is enough to get the tags, which is great in the case of Promitor :)

tomkerkhove commented 5 years ago

That is correct! Will add this tomorrow!

tomkerkhove commented 5 years ago

Figuring out how we can easily do this in a generic way via https://github.com/Azure/azure-libraries-for-net/issues/719

Worst case this is moved to a later version.

adamconnelly commented 5 years ago

I'd be up for taking a look at this, but I just wanted to explain my use case first, and clarify a bit more about how it could work.

Use Case

At work, we have multiple applications made up of different components (web server, database, redis cache, etc). We also have multiple environments (production, staging, etc). I'd like to be able to make sure that our metrics had a consistent set of labels on them:

environment
application
region
component

This will allow us to do a few things:

Create Grafana dashboards showing different parts of the same application.
Group alerts together, so that a problem in an SQL Server database that causes a knock-on error in the web application results in a single alert for the application, making it easier to diagnose problems.

With this labeling scheme, we might end up with a set of metrics like the following:

azure_sql_dtu_percent{environment="production", application="diary", region="euw", component="sql-server"} 54
azure_sql_dtu_percent{environment="production", application="diary", region="usnc", component="sql-server"} 43
azure_sql_dtu_percent{environment="production", application="login", region="euw", component="sql-server"} 49
azure_sql_dtu_percent{environment="staging", application="diary", region="euw", component="sql-server"} 23
azure_sql_dtu_percent{environment="staging", application="login", region="euw", component="sql-server"} 12
azure_redis_cache_hits{environment="production", application="diary", region="euw", component="redis"} 10000
azure_redis_cache_hits{environment="production", application="diary", region="usnc", component="redis"} 9500
azure_redis_cache_hits{environment="production", application="login", region="euw", component="redis"} 5609
azure_redis_cache_hits{environment="staging", application="diary", region="euw", component="redis"} 500
azure_redis_cache_hits{environment="staging", application="login", region="euw", component="redis"} 240

At the moment we can configure the component label in Promitor using the following:

name: azure_redis_cache_hits
description: "The number of successful key lookups ..."
resourceType: RedisCache
labels:
  component: redis
...
resources:
- cacheName: Promitor-1
- cacheName: Promitor-2

But the other labels would have to be set per resource, which would make the configuration difficult.

Possible Implementation

Configuration

We could alter the metrics configuration to allow users to opt-in to adding Azure tags as labels. I think it would make sense to either add this to the existing azureMetadata section, or to create a new azure section. Personally I'd put it into azureMetadata to keep the config as simple as possible unless you see a problem with that.

I think this feature should be opt-in, and we should be able to specify which labels we want to include, for example:

azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
  addLabelsForTags:
    tags:
    - environment
    - region
    - application
    - component

I can imagine that some people might have simple setups where they just want to add all the tags, in which case we could do something like this:

azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
  addLabelsForTags:
    includeAllTags: true

Implementation

I've got one main question about the implementation of this: should we get the tags once at startup, or during every scrape. The advantage of doing it once at startup would be that we'd avoid running an extra API request per scrape, but the disadvantage is that any changes to tags wouldn't be picked up until Promitor restarted. I don't think this would matter for my use case since our tags will be very static, but maybe that's a problem for some people?

Any thoughts on this?

tomkerkhove commented 5 years ago

Thanks for the suggestion and your usecase @adamconnelly!

I've been thinking about this as well and there are a couple of things:

This should be opt-in indeed and we should check what the required ARM role is to read them so we can document it.
People should be able to list the tags they want to use, similar to your case
People should be able to use all tags but list the ones to exclude
People should be able to configure a value to be used in case one resource of a list does not have the tag

This makes me thing in these terms:

Promitor runtime allows you to opt-in and configure a default value
Metric declaration allows you to define default include/exclude of tags, with override per metric

In terms of getting the tags, I would get them every time and see how things go. Over time we could still cache them or only fetch them once but then it becomes tricky as we might report stale data.

Now I would not implement everything as part of this feature, for example I would create a dedicated issue for excluding tags and one for annotating the ones to include rather than to add everything.

Or would annotating those that you want be the best approach to avoid having too many labels🤔 Now that I think of this I'm actually leaning toward whitelist only.

Runtime Configuration

server:
  httpPort: 80 # Optional. Default: 80
azure:
  tagEnrichment:
    notFoundValue: N/A # Optional. Default: N/A
    enableEnrichment: true # Optional. Default: false

Metric Declaration

So I'm more thinking along these lines, but I'm open for feedback:

version: v1
azureMetadata:
  tenantId: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  subscriptionId: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
  resourceGroupName: promitor
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    # Every minute
    schedule: "0 * * ? * *"
  azureTags:
    include:
    - foo
    - bar
    exclude:
    - foo
    - bar
metrics:
  - name: azure_service_bus_active_messages
    description: "The number of active messages on a service bus queue."
    resourceType: ServiceBusQueue
    labels:
      app: promitor
      tier: messaging
    azureTags:
      include:
      - environment
      exclude:
      - deployedBy
      - version

adamconnelly commented 5 years ago

That makes sense to me. Couple of questions:

I take it the include and exclude lists are mutually exclusive?
I think it makes sense to allow this to be specified at the global, and per-metric level like you've suggested. Should we support only specifying it at the metric level without specifying a default (i.e. if you only wanted to pull in the tags for a single metric or a certain subset)? I don't think this would be a use-case for me, but I just want to get your thoughts.

tomkerkhove commented 5 years ago

I think having both would be ideal to avoid duplication, for example the environment Azure Tag would be on all resources rather than a handful.

The list would be mutually exclusive indeed.

I'll need to give the include/exclude some thought but for the this issue I'd say let's just add all of them.

adamconnelly commented 5 years ago

@tomkerkhove have you had any thoughts yet about how to get around https://github.com/Azure/azure-libraries-for-net/issues/719?

adamconnelly commented 5 years ago

In case this helps, I've done a bit of playing about, and found that we can get the resource using IGenericResources.Get(). For example:

_resourceManager.GenericResources.Get(ResourceUtils.GroupFromResourceId(resourceId),
                ResourceUtils.ResourceProviderFromResourceId(resourceId),
                ResourceUtils.ParentRelativePathFromResourceId(resourceId),
                ResourceUtils.ResourceTypeFromResourceId(resourceId),
                ResourceUtils.NameFromResourceId(resourceId),
                "2018-06-01-preview");

There's (at least!) a couple of problems / annoyances with that:

The method isn't async.
We'd need to specify an appropriate API version for each resource provider.

tomkerkhove commented 5 years ago

@adamconnelly I did some planning and would like to move this to v1.2, is that an issue for you? Is this a requirement on your end?

I want to ship v1.1 fairly soon and it feels like this is not well-supported yet and we might need to ask for better support

adamconnelly commented 5 years ago

That's no problem for me.

When I started to look at it I hadn't spotted the problem you'd found with the Azure client library. At the moment it looks like we either wait for better support, or we'd have to put a lot of effort in to work around it. Either way it's not gonna be a quick fix.

tomkerkhove commented 5 years ago

In that case I'd sit back and wait until there is better support, no pressure on this feature!

tomkerkhove commented 4 years ago

Ideally this is done by our new Resource Discovery Agent which will give the information to Promitor (see: https://github.com/tomkerkhove/promitor/issues/512)

taruna-verma commented 3 years ago

Is this feature implemented? I am also in need of adding the azure tags in the metrics dimension. Please help.

tomkerkhove commented 3 years ago

This is not supported yet, unfortunately.

Would you prefer to have this for all metrics or only when using resource discovery?

taruna-verma commented 3 years ago

All the metrics is preferred but good to have for both.

taruna-verma commented 3 years ago

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

tomkerkhove commented 3 years ago

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

Do you mean to have it up and running or contribute the feature itself?

taruna-verma commented 3 years ago

Can you please provide some tips, if I want to implement this for all the metrics in the scraper?

Do you mean to have it up and running or contribute the feature itself?

I would love to contribute on this feature, please provide me details.

tomkerkhove commented 3 years ago

That would be great, thanks! Let's focus on resource discovery first now, and then we can add it to declarative metrics later on.

I'll circle back on this later on

tomkerkhove commented 3 years ago

The resource discovery group should have a new field called enrichments which allows you to opt-in for tag enrichment when enabled: true:

version: v1
azureLandscape:
  tenantId: c8819874-9e56-4e3f-b1a8-1c0325138f27
  subscriptions:
  - SUBSCRIPTON-ID-ABC
  cloud: China
resourceDiscoveryGroups:
- name: container-registry-landscape
  type: ContainerRegistry
- name: filtered-logic-apps-landscape
  type: LogicApp
  enrichments:
    tags:
      enabled: true
      scope: [resource, resourceGroup]
      exclude:
      - app
  criteria:
    include:
      resourceGroups:
      - promitor-resource-group-1
      - promitor-resource-group-2
      tags:
        app: promitor
        region: europe

Bonus points if we allow exclusion by defining the tag names through exclude and/or define the scope to look for the tags on either resource and/or resourceGroup.

The API endpoint should introduce a new field called Dimensions:

[
    {
        "$type": "Promitor.Core.Contracts.ResourceTypes.ServiceBusNamespaceResourceDefinition, Promitor.Core.Contracts",
        "Namespace": "promitor-messaging-other-subscription",
        "QueueName": "",
        "TopicName": "",
        "ResourceType": "ServiceBusNamespace",
        "SubscriptionId": "0329dd2a-59dc-4493-aa54-cb01cb027dc2",
        "ResourceGroupName": "promitor-sources",
        "ResourceName": "promitor-messaging-other-subscription",
        "UniqueName": "promitor-messaging-other-subscription--",
        "Dimensions":{
            "region": "europe"
        }
    }
]

tomkerkhove commented 3 years ago

Extending the example above, we should:

Support resourceDiscoveryGroups.[].enrichments.tags.include
Remove resourceDiscoveryGroups.[].enrichments.scope as we will only support resources for now
Provide capability to configure the default value if it was not found (default: N/A)

However, next to the approach above we will provide system metrics on the discovery agent for every single resource so that you can use a more centralized view as well.

KeyanatGiggso commented 2 years ago

What's the status on this @tomkerkhove ? Any good News for this Improvement ?

tomkerkhove commented 2 years ago

Not at the moment but I'm open to contributions.

tomkerkhove / promitor