cloud-custodian / cloud-custodian

Rules engine for cloud security, cost optimization, and governance, DSL in yaml for policies to query, filter, and take actions on resources
https://cloudcustodian.io
Apache License 2.0

Azure: Not able to deploy custodian using azure function (first time deployment) #6520

Open aakifshaikh opened 3 years ago

aakifshaikh commented 3 years ago

Describe the bug

I am attempting to deploy a policy in Azure Functions hosting mode, following the documentation. I am deploying with a service principal configured through the environment variables AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_SUBSCRIPTION_ID, and AZURE_TENANT_ID, but during deployment I receive the following error:

(custodian) XYZ@ cloud-custodian % custodian validate sec-n-resourcegroup-orphaned.yml 
2021-03-10 20:14:57,657: custodian.azure.policy.AzureFunctionMode:ERROR policy:sec-n-resourcegroup-orphaned function policies should use UserAssigned Identities see https://cloudcustodian.io/docs/azure/configuration/functionshosting.html#authentication-options
2021-03-10 20:14:57,657: custodian.commands:INFO Configuration valid: sec-n-resourcegroup-orphaned.yml
(custodian) XYZ@ cloud-custodian % custodian run --output-dir=. sec-n-resourcegroup-orphaned.yml 
2021-03-10 20:15:55,078: custodian.azure.policy.AzureFunctionMode:ERROR policy:sec-n-resourcegroup-orphaned function policies should use UserAssigned Identities see https://cloudcustodian.io/docs/azure/configuration/functionshosting.html#authentication-options
2021-03-10 20:15:55,103: custodian.azure.session:INFO Authenticated [Azure CLI | af-b2xxxxxxxxxxa67ec0]
2021-03-10 20:15:55,104: custodian.azure.policy.AzureFunctionMode:ERROR policy:sec-n-resourcegroup-orphaned function policies should use UserAssigned Identities see https://cloudcustodian.io/docs/azure/configuration/functionshosting.html#authentication-options
2021-03-10 20:15:55,104: custodian.commands:ERROR Error while executing policy sec-n-resourcegroup-orphaned, continuing
Traceback (most recent call last):
  File "/Users/axxxxxxx/Dxxxxxxs/CxxxxxxxxAN/cloud-custodian/c7n/commands.py", line 271, in run
    policy()
  File "/Users/axxxxxx/Dxxxxxxxs/CLxxxxxxxN/cloud-custodian/c7n/policy.py", line 1182, in __call__
    resources = mode.provision()
  File "/Users/axxxx/Dxxxxnts/CLxxxxxxN/cloud-custodian/tools/c7n_azure/c7n_azure/policy.py", line 384, in provision
    super(AzurePeriodicMode, self).provision()
  File "/Users/axxxxx/Doxxxxxts/CLxxxxxxN/cloud-custodian/tools/c7n_azure/c7n_azure/policy.py", line 261, in provision
    session.get_functions_auth_string("")
  File "/Users/axxxx/Doxxxxs/CLxxxxxxxIAN/cloud-custodian/tools/c7n_azure/c7n_azure/session.py", line 289, in get_functions_auth_string
    raise NotImplementedError(
NotImplementedError: Service Principal credentials are the only supported auth mechanism for deploying functions.

- name: sec-n-resourcegroup-orphaned
  resource: azure.resourcegroup
  description: Find all Resource Groups that have no resources in the subscription. This policy runs every day at 9:30AM to check for the orphaned resources.
  filters:
    - type: empty-group
  mode:
    schedule: 0 30 9 * * *
    type: azure-periodic
    provision-options:
      identity:
        type: Embedded
    execution-options:
      output_dir: azure://cloudcustxxxxxxxxxxet/logs/{policy_name}/{now:%Y/%m/%d/%H/}
  actions:
    - type: logic-app
      resource-group: cloud-custodian            
      logic-app-name: custodian-notifications    
      batch: false                               
      body: >                                    
          {
          PolicyName: policy.name,
          PolicyDescription: policy.description,
          Resource: resource.
              {
              Name: name,
              Location: location,
              Owner: tags.owner,
              VmSize: properties.hardwareProfile.vmSize
              }
          }

policies:

- name: sec-n-resourcegroup-orphaned
  resource: azure.resourcegroup
  description: Find all Resource Groups that have no resources in the subscription. This policy runs every day at 9:30AM to check for the orphaned resources.
  filters:
    - type: empty-group
  mode:
    schedule: 0 30 9 * * *
    type: azure-periodic
    execution-options:
      output_dir: azure://cloudcuxxxxxxxxxxxxet/logs/{policy_name}/{now:%Y/%m/%d/%H/}
  actions:
    - type: logic-app
      resource-group: cloud-custodian            
      logic-app-name: custodian-notifications    
      batch: false                               
      body: >                                    
          {
          PolicyName: policy.name,
          PolicyDescription: policy.description,
          Resource: resource.
              {
              Name: name,
              Location: location,
              Owner: tags.owner,
              VmSize: properties.hardwareProfile.vmSize
              }
          }

Custodian version adal==1.2.6 applicationinsights==0.11.9 APScheduler==3.7.0 argcomplete==1.12.2 attrs==20.3.0 azure-cli-core==2.20.0 azure-cli-telemetry==1.0.6 azure-common==1.1.26 azure-core==1.12.0 azure-cosmos==3.2.0 azure-cosmosdb-nspkg==2.0.2 azure-cosmosdb-table==1.0.6 azure-functions==1.6.0 azure-graphrbac==0.61.1 azure-keyvault==1.1.0 azure-mgmt-apimanagement==0.1.0 azure-mgmt-applicationinsights==0.2.0 azure-mgmt-authorization==0.60.0 azure-mgmt-batch==7.0.0 azure-mgmt-cdn==4.0.0 azure-mgmt-cognitiveservices==5.0.0 azure-mgmt-compute==10.0.0 azure-mgmt-containerinstance==1.5.0 azure-mgmt-containerregistry==2.8.0 azure-mgmt-containerservice==8.3.0 azure-mgmt-core==1.2.2 azure-mgmt-cosmosdb==0.11.0 azure-mgmt-costmanagement==0.1.0 azure-mgmt-databricks==0.1.0 azure-mgmt-datafactory==0.8.0 azure-mgmt-datalake-nspkg==3.0.1 azure-mgmt-datalake-store==0.5.0 azure-mgmt-dns==3.0.0 azure-mgmt-eventgrid==2.2.0 azure-mgmt-eventhub==3.1.0 azure-mgmt-hdinsight==1.7.0 azure-mgmt-iothub==0.10.0 azure-mgmt-keyvault==1.1.0 azure-mgmt-logic==3.0.0 azure-mgmt-managementgroups==0.2.0 azure-mgmt-monitor==0.7.0 azure-mgmt-network==9.0.0 azure-mgmt-nspkg==3.0.2 azure-mgmt-policyinsights==0.4.0 azure-mgmt-rdbms==1.9.0 azure-mgmt-redis==6.0.0 azure-mgmt-resource==6.0.0 azure-mgmt-resourcegraph==2.0.0 azure-mgmt-search==2.1.0 azure-mgmt-sql==0.16.0 azure-mgmt-storage==7.2.0 azure-mgmt-subscription==0.5.0 azure-mgmt-web==0.44.0 azure-nspkg==3.0.2 azure-storage-blob==2.1.0 azure-storage-common==2.1.0 azure-storage-file==2.1.0 azure-storage-queue==2.1.0 bcrypt==3.2.0 boto3==1.17.23 botocore==1.20.23 -e git+https://github.com/cloud-custodian/cloud-custodian.git@3xxxxxxxxxxx4e9ae65f#egg=c7n -e git+https://github.com/cloud-custodian/cloud-custodian.git@368xxxxxxxx65f#egg=c7n_azure&subdirectory=tools/c7n_azure c7n-gcp==0.4.8 c7n-mailer==0.6.9 cachetools==4.1.1 certifi==2020.12.5 cffi==1.14.4 chardet==4.0.0 click==7.1.2 colorama==0.4.4 cryptography==3.4.6 datadog==0.34.1 decorator==4.4.2 
distlib==0.3.1 google-api-core==1.23.0 google-api-python-client==1.12.8 google-auth==1.23.0 google-auth-httplib2==0.0.4 google-cloud-core==1.4.4 google-cloud-logging==1.15.1 google-cloud-monitoring==0.34.0 google-cloud-storage==1.33.0 google-crc32c==1.0.0 google-resumable-media==1.1.0 googleapis-common-protos==1.52.0 grpcio==1.34.0 httplib2==0.18.1 humanfriendly==9.1 idna==2.10 importlib-metadata==3.7.2 isodate==0.6.0 Jinja2==2.11.2 jmespath==0.10.0 jsonpatch==1.28 jsonpickle==1.3 jsonpointer==2.0 jsonschema==3.2.0 knack==0.8.0rc2 ldap3==2.8.1 MarkupSafe==1.1.1 msal==1.10.0 msrest==0.6.21 msrestazure==0.6.4 netaddr==0.7.20 oauthlib==3.1.0 paramiko==2.7.2 pkginfo==1.7.0 portalocker==1.7.1 protobuf==3.14.0 psutil==5.8.0 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycparser==2.20 Pygments==2.8.1 PyJWT==1.7.1 PyNaCl==1.4.0 pyOpenSSL==20.0.1 pyrsistent==0.17.3 python-dateutil==2.8.1 python-http-client==3.3.1 pytz==2020.4 PyYAML==5.4.1 ratelimiter==1.2.0.post0 redis==3.5.3 requests==2.25.1 requests-oauthlib==1.3.0 retrying==1.3.3 rsa==4.6 s3transfer==0.3.4 sendgrid==6.4.8 six==1.15.0 starkbank-ecdsa==1.1.0 tabulate==0.8.9 typing-extensions==3.7.4.3 tzlocal==2.1 uritemplate==3.0.1 urllib3==1.26.3 zipp==3.4.0

(custodian) xxxxxx@XYZ tools % custodian version
0.9.11

stefangordon commented 3 years ago

Hi @aakifshaikh

In the current release, embedded credentials are disabled because User Assigned Identities are significantly more secure. It looks like we might need to do some documentation cleanup on this.

You can create your UAI in the portal, assign it the appropriate roles, and then deploy your function again with custodian and an updated policy YAML.

Creating the UAI: https://docs.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-manage-ua-identity-portal

You'll update your policy YAML with the identity name as shown in the first example here: https://cloudcustodian.io/docs/azure/configuration/functionshosting.html#authentication-options

With this model you have no credentials committed to the Azure Function or associated zip files which could be leaked, and you also reduce the credential requirements for the user deploying the function.

Let me know if you run into problems deploying this way; any feedback will help with updating the documentation.
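
For reference, a minimal sketch of what the updated `mode` block might look like once the identity exists. The `type: UserAssigned` and `id` keys follow the form used in the policies later in this thread; the identity name `custodian-uai` is a placeholder assumption, not a value from this issue.

```yaml
mode:
  type: azure-periodic
  schedule: 0 30 9 * * *
  provision-options:
    identity:
      # Replaces the `type: Embedded` setting that triggered the error above.
      type: UserAssigned
      # Placeholder (assumption): the name of the User Assigned Identity you created.
      id: custodian-uai
```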

aakifshaikh commented 3 years ago

Must this newly created user assigned identity have the following roles: Contributor, Blob Data Contributor, and Queue Data Contributor? I was able to successfully deploy the first policy to Azure after following your instructions. Thank you! However, I cannot see resources.json (the output file from custodian). On my machine the output-dir is empty: it created a folder with the policy name, but the folder is empty. In the console, however, I see a lot of orphaned resource groups, and custodian is not picking them up. I am not sure where I am going wrong.

stefangordon commented 3 years ago

So I would expect it to be writing output to that storage account if it's configured properly and has Blob Data Contributor for it.

azure://cloudcuxxxxxxxxxxxxet/logs/{policy_name}/{now:%Y/%m/%d/%H/}

Of course it will only run at 9:30 each day, or whatever your CRON schedule is set for.
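
As a reading aid for the schedule strings in this thread: Azure Functions timer triggers use six-field expressions with a leading seconds field, which is how `0 30 9 * * *` comes out as 9:30 each day. The annotated fragment below is only a sketch; the time zone note reflects the Azure Functions default (UTC) and should be checked against your own app settings.

```yaml
mode:
  type: azure-periodic
  # Fields: second minute hour day-of-month month day-of-week
  # 0 30 9 * * *  -> at 09:30:00 every day (UTC unless the app overrides it)
  schedule: 0 30 9 * * *
  # For quicker feedback while testing, every five minutes would be:
  # schedule: 0 */5 * * * *
```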

aakifshaikh commented 3 years ago

(custodian) axxxx@xxxxxx cloud-custodian % custodian report --output-dir=. --format grid sec-n-resourcegroup-orphaned.yml                            
2021-03-11 12:47:41,378: custodian.azure.session:INFO Authenticated [Azure CLI | 1xxxxxxxxxx7ec0]
+--------+------------+
| name   | location   |
+========+============+
+--------+------------+
(custodian) axxxx@xxxxxx cloud-custodian % custodian report --output-dir=. --format grid hyg-n-resourcegroup-missing-tag.yml 
2021-03-11 12:58:22,496: custodian.azure.session:INFO Authenticated [Azure CLI | 000xxxxxxxc0]
+--------+------------+
| name   | location   |
+========+============+
+--------+------------+

I tried custodian report and it still shows no results.

1) I checked the storage account: it has the blob contributor and queue contributor roles, as mentioned in the documentation.
2) My policy also has the correct output_dir path, the one you mentioned above.
3) I saw that custodian created another storage account the first time it ran, and in it created the following containers: azure-webjobs-hosts, azure-webjobs-secrets, scm-releases.
4) The storage account azure://cloudcuxxxxxxxxxxxxet/logs/ does not contain anything. I even tried running the policy every 5 minutes for the last 30 minutes and still don't see any output.

(custodian) xxxx@xxxxx cloud-custodian % custodian run --output-dir=. sec-n-resourcegroup-orphaned.yml
2021-03-11 12:47:14,511: custodian.azure.session:INFO Authenticated [Azure CLI | 00xxxxxx]
2021-03-11 12:47:15,916: custodian.azure.deployment_unit.DeploymentUnit:INFO Found Function Application "sec-n-resourcegroup-orphaned-cxxxxx7".
2021-03-11 12:47:16,019: custodian.azure.policy.AzureFunctionMode:INFO Building function package for sec-n-resourcegroup-orphaned-cxxxx7
2021-03-11 12:47:16,443: custodian.azure.policy.AzureFunctionMode:INFO Function package built, size is 595KB
2021-03-11 12:47:16,443: custodian.azure.function_app_utils:INFO Publishing Function application
2021-03-11 12:47:22,692: custodian.azure.function_package.FunctionPackage:INFO Publishing Function package from /var/folders/kq/vs40cs7n14sfxxxxxxxx000gp/T/tmxxxxxe
2021-03-11 12:47:23,669: custodian.azure.function_package.FunctionPackage:INFO Function publish result: 202
2021-03-11 12:47:23,669: custodian.azure.function_app_utils:INFO Finished publishing Function application

aakifshaikh commented 3 years ago

I noticed that the Function App custodian created for that policy (Appfunction/functions/integration) shows the workflow, but the output is blank. See the attachment:

Screen Shot 2021-03-11 at 1 08 32 PM

stefangordon commented 3 years ago

It sounds like there may be some error preventing it from actually executing. If you click into the Monitor option on the left there in your picture then you should see all the executions and if you click into them perhaps we will find a clue.

You should note the data in the Monitor view is ~5 minutes late.

aakifshaikh commented 3 years ago

So I deleted the existing function and ran the "custodian run" command again. It created a new function and this time it got connected to the storage account (output-dir). However, the resources.json file is empty.

(custodian) xxxxx@xxxxxx cloud-custodian % custodian report --output-dir=. --format grid sec-n-resourcegroup-orphaned.yml
2021-03-11 14:45:30,298: custodian.azure.session:INFO Authenticated [Azure CLI | 0000xxxxxx0]
+--------+------------+
| name   | location   |
+========+============+
+--------+------------+

Storage Account Screenshot Screen Shot 2021-03-11 at 2 52 00 PM

AppFunction Screenshot Screen Shot 2021-03-11 at 2 47 20 PM

aakifshaikh commented 3 years ago

Is anything wrong with my policy? I see the storage account is updated every 5 minutes with a new resources.json file, but the file is empty. I even tried --format grid, and it shows no results.

vars:
  absent-tags-filter: &absent-tags
    - "tag:owner": absent
    - "tag:service": absent

policies:

- name: hyg-n-resourcegroup-missing-tag
  resource: azure.resourcegroup
  description: Find all Resource Groups that does not meet the mandatory tagging requirements (owner, service).
  filters:
    - or: *absent-tags
  mode:
    type: azure-periodic
    schedule: 0 */5 * * * *
    provision-options:
      identity:
        type: UserAssigned
        id: xxxxxxxx
    execution-options:
      output_dir: azure://cloxxxxxxxxnet/lxxs/{policy_name}/{now:%Y/%m/%d/%H/}
  actions:
    - type: logic-app
      resource-group: cloud-xxxxxn            
      logic-app-name: custodian-noti-xxxx-xxxxs    
      batch: false                               
      body: >                                    
          {
          PolicyName: policy.name,
          PolicyDescription: policy.description,
          Resource: resource.
              {
              Name: name,
              Location: location,
              Owner: tags.owner,
              VmSize: properties.hardwareProfile.vmSize
              }
          }

policies:

- name: sec-n-resourcegroup-orphaned
  resource: azure.resourcegroup
  description: Find all Resource Groups that have no resources in the subscription. 
  filters:
    - type: empty-group
  mode:
    type: azure-periodic
    schedule: 0 */5 * * * *
    provision-options:
      identity:
        type: UserAssigned
        id: xxxxxxxx
    execution-options:
      output_dir: azure://cxxxxxxnet/lxxxx/{policy_name}/{now:%Y/%m/%d/%H/}
  actions:
    - type: logic-app
      resource-group: cloud-cust-xxxx-xxxx            
      logic-app-name: custodian-noti-xxxx    
      batch: false                               
      body: >                                    
          {
          PolicyName: policy.name,
          PolicyDescription: policy.description,
          Resource: resource.
              {
              Name: name,
              Location: location,
              Owner: tags.owner,
              VmSize: properties.hardwareProfile.vmSize
              }
          }

stefangordon commented 3 years ago

sec-n-resourcegroup-orphaned looks correct and it appears to be running successfully (but finding no resources).

So I would guess either: 1) There are no empty resource groups or 2) It does not have permissions to see them

You can create a new empty resource group for testing, and also go verify that the identity has reader or contributor for the whole subscription.

If you remove the mode section from the policy and run it at the command line does it find any results?
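
A sketch of that local check: the same policy with the `mode` block removed so it runs directly from the command line. The `actions` block is also left out here, which is a choice of this sketch rather than part of the suggestion above, so that a test run only lists matches and does not call the logic app.

```yaml
policies:

- name: sec-n-resourcegroup-orphaned
  resource: azure.resourcegroup
  description: Find all Resource Groups that have no resources in the subscription.
  filters:
    - type: empty-group
  # No `mode` block: the policy executes locally with your current credentials
  # (for example the Azure CLI login) instead of being deployed as a function.
```

Running it with the same `custodian run --output-dir=.` command used earlier in the thread should write resources.json to the local output directory. If that file lists resource groups while the function's output stays empty, the identity's permissions are the likelier cause.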

One other thing that can happen and be confusing is that Azure does very long caching on user identities. If you added the roles to the identity after the function was already deployed it could take many hours to work. I am not sure if this still happens with Consumption function plans - I guess it could be unpredictable.

aakifshaikh commented 3 years ago

1) Notification piece in Logic App: I know the template uses Resource.Owner for the "To" address. What if the owner tag is missing? Can we introduce a fallback / alternate email address?

Use case: I have many missing-tag policies where we know tags.owner is missing, so the logic app cannot send the email. In that case, is there an option I can embed in either the policy or the logic app workflow to send to the AABC address when the tag is absent?

2) I want to call out the subscription name in the email the Logic App sends. This will let the user quickly tell which subscription has non-compliant items. Can I include it in either the subject line or the body? And how? Can you please share the code? Is it resource.displayName?

aakifshaikh commented 3 years ago

I want to say "THANK YOU" for helping me troubleshoot and get this successfully deployed. I have deployed this into a sandbox and am now writing all the policies to see what works and what doesn't. A few questions:

1) Since I am using the UserAssignedIdentity (UAI) in the policy code itself. Do I still need the Service Principal?
2) Upon running az-login - it does authenticate by opening a separate tab in the browser- so do we still need a service principal and an app in Azure Active Directory/App registrations?
3) Do you have a GitHub repo to see all your Azure Cloud Custodian Policies?

stefangordon commented 3 years ago

Since I am using the UserAssignedIdentity (UAI) in the policy code itself. Do I still need the Service Principal?

No, you don't.

Upon running az-login - it does authenticate by opening a separate tab in the browser- so do we still need a service principal and an app in Azure Active Directory/App registrations?

No, you will use that browser/CLI auth locally for deployments, and the UAI for the functions to run. You don't need an app registration.

Do you have a GitHub repo to see all your Azure Cloud Custodian Policies?

Unfortunately, we do not have this yet. There is an example section under Azure on the documentation site, and the individual resource reference pages each have simple samples.

I want to call out the subscription name in the email when Logic-App sends an email.

There is a list of available fields here: https://cloudcustodian.io/docs/azure/resources/index.html

For azure, the account_id will be the subscription ID, or resource.resourceGroup will be the group name, but we don't have the subscription display name. You can include these fields in the JSON you send to the logic app and use them as needed.
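
A sketch of how those fields might be added to the `body` expression used in the policies above. `account_id` is referenced at the top level and `resourceGroup` inside the `resource` projection, following the comment above; the output key names `SubscriptionId` and `ResourceGroup` are illustrative assumptions, and the logic app workflow would need to be updated to read them.

```yaml
actions:
  - type: logic-app
    resource-group: cloud-custodian
    logic-app-name: custodian-notifications
    batch: false
    # SubscriptionId and ResourceGroup are hypothetical key names added for illustration.
    body: >
        {
        PolicyName: policy.name,
        PolicyDescription: policy.description,
        SubscriptionId: account_id,
        Resource: resource.
            {
            Name: name,
            Location: location,
            Owner: tags.owner,
            ResourceGroup: resourceGroup
            }
        }
```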

aakifshaikh commented 3 years ago

Logic-App: I tried to use other attributes / parameters from what was mentioned in the documentation here.

I included several attributes and they keep failing. Please see the screenshots.

-kind
-type
-service
-account_id

All of the above give an error.

Logic-App-Service

Logic-App-Type

Logic-App-Kind