Azure / ARO-RP

Azure Red Hat OpenShift RP
https://azure.microsoft.com/products/openshift/
Apache License 2.0
101 stars 170 forks source link

Allow customers to specify tags to be set on CRG at create time #359

Open jim-minter opened 4 years ago

jim-minter commented 4 years ago

Helps when customers have a policy requiring tags to be set and otherwise get a RequestDisallowedByPolicy error.

Probably requires an API change, low priority for now.

jim-minter commented 4 years ago

from #720:

The issue is unclear - what customers want is a way to tag all resources under the CRG, and maybe the CRG as well (don't know). That is fiddly because it implies changes to the machine API so that worker VMs are tagged, and tag update is even worse. It is also not necessarily true that a customer would want the tags on the openShiftClusters resource to match those of all of the other resources. There'd need to be another field added to the API hence the API change.

m1kola commented 4 years ago

Today we received another incident where cluster creation failed due to RequestDisallowedByPolicy: a policy requires tags on a resource group to be set.

m1kola commented 4 years ago

Workaround description below. This addresses specific case when a policy checks for a tag on resource group, but resources in the resoruce group.

It shoukd be possible to use a similar workaround in case of policy that checks for tags on resources: we will have to use something like this instead of the rule in the workaround description:

{
    "value": "[resourceGroup().name]",
    "like": "aro-*"
},

When a customer creates an ARO cluster, ARO resource provider creates a cluster resource group that will contain cluster VMs, load balancers, and other resources. When a subscription has a policy that denies resource groups without a specific tag, cluster creation fails on this step.

I think the workaround is to add an exception to the policy rule that checks tags on resource groups.

The customer will need to modify their policy definition so that the policyRule excludes ARO cluster resource groups by pattern. It should be something similar (exact definition will depend on how the rest of the policy defined):

{
    "field": "name",
    "notLike": "aro-*"
},

By default ARO resource provider creates cluster resource groups with aro- prefix (might change in the future). The rule above excludes these resource groups.

Note that it is possible to explicitly provide a name for the cluster resource group when running az aro create using the --cluster-resource-group flag. So after applying the above policy rule commands like az aro create --cluster-resource-group my-group-name will still be failing.

Also note that it is going to be possible for users to manually create resource groups with aro- prefix and skip the policy.

redhatstuart commented 4 years ago

I now have a customer to which this is a blocker for them deploying ARO.

Succeed: az group update -g aro-stkirk-bgihi-eastus --tags "ARO 4.4.17 Build Date=20200913-21243220200913-212432" Fail: az group update -g aro-stkirk-bgihi-eastus-cluster --tags "ARO 4.4.17 Build Date=20200913-21243220200913-212432"

Error: The client 'stkirk at microsoft.com' with object id 'X' has permission to perform action 'Microsoft.Resources/subscriptions/resourcegroups/write' on scope '/subscriptions/X/resourcegroups/aro-stkirk-bgihi-eastus-cluster'; however, the access is denied because of the deny assignment with name 'X' and Id 'X' at scope '/subscriptions/X/resourcegroups/aro-stkirk-bgihi-eastus-cluster'.

chrisholcomb-usa commented 4 years ago

Hello, our Azure Cloud team has been asked to deploy ARO clusters which have been successfully built. An issue we are encountering is that we are unable to tag the ARO cluster resource group or resources in the cluster. This is causing us issues as we have mandates to provide our finance team with accurate cost data that requires a set of tags to accomplish. We have been utilizing Power BI to create multiple reports however, since ARO is unable to be tagged we are not able to capture the clusters expenditures to properly align them with financial buckets. We can create Power BI functions to do this but the process is cumbersome at best which still leaves anyone viewing cost management data in the Azure portal with bad data. How can the ARO clusters be tagged?

asalkeld commented 4 years ago

Hello, our Azure Cloud team has been asked to deploy ARO clusters which have been successfully built. An issue we are encountering is that we are unable to tag the ARO cluster resource group or resources in the cluster. This is causing us issues as we have mandates to provide our finance team with accurate cost data that requires a set of tags to accomplish. We have been utilizing Power BI to create multiple reports however, since ARO is unable to be tagged we are not able to capture the clusters expenditures to properly align them with financial buckets. We can create Power BI functions to do this but the process is cumbersome at best which still leaves anyone viewing cost management data in the Azure portal with bad data. How can the ARO clusters be tagged?

Hi @chrisholcomb-usa Would tagging just the resource group work for you? Or do you need to tag all resources within as well?

chrisholcomb-usa commented 4 years ago

Hello,

Yes, tagging the resource groups would suffice as we are building additional reports to tag RGs instead of all individual resources. Tagging the RGs would be a huge win.

Regards,

Chris Holcomb Digital Infrastructure Guidance

mjudeikis commented 4 years ago

Lets put this into perspective why I closed the PR.

First, this would translate existing tags to cluster resource group. This would not enable customer to modify, update or remove tags. This is only one way process and is more a hack than a solution. Which is shortsighted. At the same time where will be always customer somewhere who will want different tags and having the same immutable tags will break them. And it is not possible to retrofit this to all existing clusters in production. This basically changes existing create behaviour, which is antipatterns and might be considered a regression.

The alternative is to introduce a new API field. Something like properties.ClusterTags and properties.Tags where the cluster can choose tags for their different resource groups. But this has more problems on how to lifecycle them.

An even better solution would be to enable customers to add tags post create. So they can be modified, changed, retrofitted as fit by customer.

We have short discussions internally and we gonna try to pursuit last option (enable apply tags post create), and if this option will not be possible (due to lack of granular permissions to enable this on azure side), we gonna try to do new api fields for this.

Both options will need to wait for a bit ;/ sorry for the false hope. I have created internal story for this, as it theory it is quite a small amount of work I expect this to happen sooner than later, but for now, we gonna have to sit on this :/

mjudeikis commented 3 years ago

So the option to enable this post creation does not work due to fact that we need more granular level permissions.
The main issue is that OpenShift does not support tags. In example, if you tag everything in the cluster resource group, but machine API will provision new machines - those will not be tagged. Second tagging is used to track disk, loadbalancers by cloud-provider. This metadata can be corrupted and user can have lost disk or even data loss.

Overall I think Openshift installer should manage tags and propagate them, not RP.

lgmorand commented 3 years ago

@mjudeikis any update on this ? in my case, not having on the managed RG is a blocker. I tried a undocumented workaround to re-use ARO SPN and az update RG but even with that I get a deny, unsufficient result.

I agree with your point regarding Openshift Installer but can't we have a "azure way" to do it until they do it properly ?

mjudeikis commented 3 years ago

@lgmorand saddly "azure way" and kubernetes/openshift does not align well and there is some issues. And we are not willing to take stability issues regards this. main being that users would be able to modify metadata from cloud provider, or that some objects will not be propagating tags.

So its a feature we will be waiting from upstream and adopting .

@jboutaud one of the questions we keep getting :) ^^^

HoLengZai commented 3 years ago

So the option to enable this post creation does not work due to fact that we need more granular level permissions. The main issue is that OpenShift does not support tags. In example, if you tag everything in the cluster resource group, but machine API will provision new machines - those will not be tagged. Second tagging is used to track disk, loadbalancers by cloud-provider. This metadata can be corrupted and user can have lost disk or even data loss.

Overall I think Openshift installer should manage tags and propagate them, not RP.

Hi @mjudeikis, @lgmorand works with me for my case... We have an azure policy to let resources created inside its resource group inherit its tags. So even the machine API will provision new machine, the azure policy will push tags to the new machine automatically from the cluster RG tags. And I can confirm that our Azure Policy to inherit tags from RG works as we have another Azure Policy that automatically adds the tag "Creation_Date" to all resources created in Azure.

image

Hence this Tag (Creation_Date) is also added on all the resources inside this RG too.

Anyway, possible to add a simple azure cli parameter like "--tags-cluster-resource-group" in the meantime until MS and OpenShift find an agreement to handle it for the customers?

Thanks

mjudeikis commented 3 years ago

@HoLengZai Im interested to understand what permissions allows you to push tags into Deny Assignment locked resource group? Can you elaborate on this a bit as this should not be possible.

HoLengZai commented 3 years ago

Hi @mjudeikis ,

We use the following definition on our Azure Policy for all our subscriptions, I guess the Deny assignment is applied after the cluster creation.

{
  "properties": {
    "displayName": "ISSC-GCM - Add tag Creation_Date with the current time to new resources",
    "policyType": "Custom",
    "mode": "All",
    "description": "Add the tag Creation_Date = utcNow() while creating new ressources with the ISO 8601 DateTime format ‘yyyy-MM-ddTHH:mm:ss.fffffffZ’",
    "metadata": {
      "category": "Tags",
      "createdBy": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
      "createdOn": "2020-06-23T10:41:14.7425413Z",
      "updatedBy": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
      "updatedOn": "2021-03-12T15:48:21.6538752Z"
    },
    "parameters": {},
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "tags['Creation_Date']",
            "exists": "false"
          }
        ]
      },
      "then": {
        "effect": "modify",
        "details": {
          "operations": [
            {
              "operation": "add",
              "field": "tags['Creation_Date']",
              "value": "[utcNow()]"
            }
          ],
          "roleDefinitionIds": [
            "/providers/Microsoft.Authorization/roleDefinitions/xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
          ]
        }
      }
    }
  },
  "id": "/providers/Microsoft.Management/managementGroups/xxxxxxxxxxxxxx/providers/Microsoft.Authorization/policyDefinitions/xxxxxxxxxxxxxxxxx",
  "type": "Microsoft.Authorization/policyDefinitions",
  "name": "xxxxxxxxxxxxxxxxxxxxxx"
}
mjudeikis commented 3 years ago

@HoLengZai , deny assignment is there from the beginning. Let me get this right, you are saying that with this policy in place all resources has Creation_Date tag? Including loadbalancers, disk, VMS? If this a case can you please send me your cluster clusterResourceID to {redacted out } so I can check this how does this looks like from our backends to see if there might be other ways to do this, as as per my understanding this was not possible due to lack of https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/tag-resources#required-access (Microsoft.Resources/tags) which we intentionally do not grant.

lgmorand commented 3 years ago

@mjudeikis I sent you these info on your email this morning. (ps: should remove your email from here no ?)

lgmorand commented 3 years ago

we thought that you added this months ago: https://github.com/Azure/ARO-RP/pull/1059

mjudeikis commented 3 years ago

@lgmorand #1059 was rolledback as it was not working as expected due to deny assignment. I will respond to mail :) thanks

rhummelmose commented 3 years ago

Guys what's going on here? :)

mjudeikis commented 3 years ago

We are still looking. It is getting bigger priority now from PO team. Challenge still is poor azure RBAC model. We can't allow Tagging on individual resources or cascading one. It just not possible inside azure permission models.

There is this But adding this to Deny Assignment allows tagging Resource group ONLY and is not propagated to other resources. But Documentation states differently.

You can have write access to the Microsoft.Resources/tags resource type. This access lets you tag any resource, even if you don't have access to the resource itself.

We are seeking explanation from ARM team on this.

rhummelmose commented 3 years ago

@mjudeikis Thanks for the explanation