Closed chrisreddington closed 1 year ago
As of today, the permissions documented at https://developer.hashicorp.com/vault/docs/secrets/azure#ms-graph-api-permissions are still incorrect. They were last updated in https://github.com/hashicorp/vault/pull/18937. The intent behind that PR was good, reducing the number of API Permissions required by the App Registration is a worthy goal.
The above PR missed one API Permission though and that is the Directory.ReadWrite.All
. I suspect that in the testing that @austingebauer did for the above PR, the engine was not configured to include any Vault Roles. Im betting that if Austin repeated the testing AND added Vault Roles to the engine that mapped to Azure Roles that the configuration would fail. Here is the output from our TFE that manages our Vault config for this engine:
Error: Error writing Azure Secret role "acl_cpt/cpt_XXXX/roles/00000000-0000-0000-0000-000000000000-contributor": Error making API request.
URL: PUT https://vault.nnnn.nnn/v1/acl_cpt/cpt_XXXX/roles/00000000-0000-0000-0000-000000000000-contributor
Code: 500. Errors:
* 1 error occurred:
* unable to lookup Azure role: authorization.RoleDefinitionsClient#List: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '00000000-0000-0000-0000-000000000000' with object id '00000000-0000-0000-0000-000000000000' does not have authorization to perform action 'Microsoft.Authorization/roleDefinitions/read' over scope '/subscriptions/00000000-0000-0000-0000-000000000000' or the scope is invalid. If access was recently granted, please refresh your credentials."
with vault_azure_secret_backend_role.this["cpt_XXXX.00000000-0000-0000-0000-000000000000-contributor"],
on secrets_engines.tf line 55, in resource "vault_azure_secret_backend_role" "this":
55: resource "vault_azure_secret_backend_role" "this" {
Once the Directory.ReadWrite.All
API Permission was added to the App Registration, Vault and the Azure Secrets Engine were able to configure the Vault Roles correctly.
We've opened a support case about this issue as well. Ref#114268.
For added valueconfusion, the tutorial pages for the Azure Secrets Engine list a much more extensive list of required API Permissions. See https://developer.hashicorp.com/vault/tutorials/secrets-management/azure-secrets?variants=vault-deploy%3Aoss#create-an-azure-service-principal-and-resource-group (steps 21 and 23).
Thanks for letting us know, @ausfestivus! We're clearly missing something for your use case. We'll get this figured out and update the documentation (+ tutorial pages).
In my testing, I did successfully create Vault roles that map to Azure roles. I was able to use those to generate credentials with success. Additionally, we have tests in the Azure secrets engine repository that use the permissions we've recommended. You can see in my pull request that our tests ran successfully.
Are you able to provide any more detail on your Vault role configuration? The error message you've shared is interesting. It seems to show that "The provided client secret keys for app '00000000-0000-0000-0000-000000000000' are expired." and not an authorization issue due to wrong permissions. Happy to help with more information!
"The provided client secret keys for app '00000000-0000-0000-0000-000000000000' are expired."
I may have inserted the wrong error above, let me revisit and confirm. Standby.
Yea, confirmed, ive inserted the wrong error message above. 🙄 Doing too many things at once. Ill fish-out and sanitise the actual error and update the above comment shortly.
Are you able to provide any more detail on your Vault role configuration?
I will provide what I can that may assist Hashicorp to refine their tests.
We use TFE to govern the Vault configuration. The Azure Secrets Engine configurations are defined in code and the configuration values are defined in a .tfvars
file. We have the following Resources defined in Terraform:
# azure secrets engine
resource "vault_azure_secret_backend" "this" {
for_each = var.azure_secret_backend
path = "${var.group_path_prefix}/${each.value.path}"
subscription_id = each.value.subscription_id
tenant_id = each.value.tenant_id
client_id = each.value.client_id
environment = "AzurePublicCloud"
use_microsoft_graph_api = true
}
# roles in the above secrets engine
resource "vault_azure_secret_backend_role" "this" {
for_each = {
for role in local.azure_role : "${role.backend_key}.${role.role_key}" => role
}
backend = vault_azure_secret_backend.this[each.value.backend_key].path
role = each.value.role
ttl = each.value.ttl
max_ttl = each.value.max_ttl
azure_roles {
role_name = each.value.azure_roles.role_name
scope = each.value.azure_roles.scope
}
}
From locals.tf
:
azure_role = flatten([
for backend_key, backend in var.azure_secret_backend : [
for role_key, role in backend.roles : {
backend_key = backend_key
role_key = role_key
backend = backend.path
role = role.name
ttl = role.ttl
max_ttl = role.max_ttl
azure_roles = role.azure_roles
}
]
])
And, the value of variable azure_secret_backend
with real values redacted:
azure_secret_backend = {
cpt_XXXX = {
path = "cpt_XXXX"
subscription_id = "00000000-0000-0000-0000-000000000000"
tenant_id = "00000000-0000-0000-0000-000000000000"
client_id = "00000000-0000-0000-0000-000000000000"
# roles for cpt_XXXX
roles = {
# Roles for Subscriptions with Dynamic SPs
"00000000-0000-0000-0000-000000000000-owner" = {
name = "00000000-0000-0000-0000-000000000000-owner"
ttl = 1200
max_ttl = 2400
azure_roles = {
role_name = "Owner"
scope = "/subscriptions/00000000-0000-0000-0000-000000000000"
}
}
"00000000-0000-0000-0000-000000000000-contributor" = {
name = "00000000-0000-0000-0000-000000000000-contributor"
ttl = 1200
max_ttl = 2400
azure_roles = {
role_name = "Contributor"
scope = "/subscriptions/00000000-0000-0000-0000-000000000000"
}
}
"00000000-0000-0000-0000-000000000000-reader" = {
name = "00000000-0000-0000-0000-000000000000-reader"
ttl = 1200
max_ttl = 2400
azure_roles = {
role_name = "Reader"
scope = "/subscriptions/00000000-0000-0000-0000-000000000000"
}
}
}
staticRoles = {}
}
}
If you're able to cobble together a prototype from the above for your test case, the repro steps are then:
Directory.ReadWrite.All
(Type: Application)When you think about it, it makes sense. Vault wants to use the AAD API to read
the Azure Role definitions. These exist in the DIrectory. Without at least Directory.Read.All
in the API Permissions then Vault wont be able to read the role definition in AAD. At least, thats what it looks like to me.
In the testing Ive done so far today, ive only completed the configuration of the Azure Secret Engine's configuration. The following MS Graph API Permissions were granted to the App Registration. This allowed the engine configuration to complete successfully.
Permission Name | Type |
---|---|
Application.ReadWrite.All | Application |
Directory.ReadWrite.All | Application |
Group.ReadWrite.All | Application |
My next step will be to scope out the work required to do the end-to-end testing of the engines full configuration and capability. I suspect that is a large body of work for us. I will be asked why Hashicorp shouldnt be the one's to do this work since it is their product. It's a fair question.
This isnt the first time we've encountered issues with the API Permissions required for the App Registration to function correctly. I do recall other issues/conversations about the Hashicorp documentation describing permissions requirements that could be considered too permissive. I know the question has been asked a few times internally here by the identity and security teams.
Quick Q @austingebauer do the tests for the Vault Role definition do the mapping via role_name
or role_id
or both?
Asking because, if you look at the doco here it states:
- If role_id is provided, it is validated and the corresponding role_name updated.
- If only role_name is provided, a case-insensitive search-by-name is made, succeeding only if exactly one matching role is found. The role_id field will updated with the matching role ID.
In our use-case, we are using the role_name
. This means that the engine needs to look the name up in AAD to retrieve the role_id
. If the app registration it uses to access AAD doesnt have at least Directory.Read.All
that look up will likely fail.
Thats my gut feel anyway.
The tests use role_name
. You can see this at path_service_principal_test.go#L942. You can see the Terraform config we're using to provide the configuration to the secrets engine in tests. It doesn't have the Directory.ReadWrite.All
permission but still works with role_name
.
Are you able to provide the actual error you're seeing? You mentioned the error you shared before was incorrect. I'm curious what the error looks like when you don't have Directory.ReadWrite.All
set. Thanks, we'll get it figured out!
Are you able to provide the actual error you're seeing?
I updated the error in the comment above to display the correct error.
Thanks @ausfestivus! It's still not clear to me why Directory.ReadWrite.All
is needed in your case vs. what we've been testing with 🤔 Do you happen to have any other ideas on what the difference might be?
I wonder if its related to default API Permissions in an AAD Tenant versus our AAD Tenants configuration. It's entirely possible that the AAD Tenant here has had a default config item changed. I will invite our identity team to engage on this issue and see what they can share.
I am still attempting to move this forward at our end.
Thanks, @ausfestivus! We have others looking at this internally as well. We're trying to understand the exact set of permissions to recommend and figure out why our tests pass without Directory.ReadWrite.All
.
Well, to make matters more interesting, we have removed the Directory.ReadWrite.All
API Permission from the App Registration. Things are working fine with that App Registration now. I dont know how or why this situation has come about.
I have detailed work journal entries for the whole thing. The order of events went something like this:
Directory.ReadWrite.All
API Permission to the App Registration. This was accepted and completed by the identity team.Following the above response from HC stating that the Engine's testing had passed with the described API Permissions in the doco:
Directory.ReadWrite.All
API Permission from the App Registration. This was accepted and completed by the identity team.I am now in the process of setting up an additional Engine for testing using a new App Registration supplied by the identity team. I want to see if the change which adds the Directory.ReadWrite.All
API Permission has any bearing on the result.
Additionally, while reviewing the work journal notes, ive noted something odd about the original error message:
│ * unable to lookup Azure role: authorization.RoleDefinitionsClient#List: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client '7e4c86b7-0000-0000-0000-000000000000' with object id '7e4c86b7-0000-0000-0000-000000000000' does not have authorization to perform action 'Microsoft.Authorization/roleDefinitions/read' over scope '/subscriptions/9588a465-0000-0000-0000-000000000000' or the scope is invalid. If access was recently granted, please refresh your credentials."
The object id
in the error is NOT the object id
from the App Registration. It is the object id
from the App Registration's Enterprise Application object in the directory. I cannot understand WHY the Vault engine would try and use the object id
for the Enterprise Application object instead of the App Registration object id
.
I may come back to consider this depending on the results of the testing with the new, 2nd App Registration.
I am unable to reproduce the original error with the 2nd App Registration.
My earlier assertion that the API Permissions listed in the Engine documentation at https://developer.hashicorp.com/vault/docs/secrets/azure#ms-graph-api-permissions is incorrect and should be ignored.
The API Permissions documentation in the Vault Learn portal here (as of today) are still different from the permissions described in the Engine's documentation above.
Thank you for the updates, @ausfestivus. I'm glad you were able to get things working with the permissions set we recommend. I will open a pull request to update the Vault learn material so that it matches our other Vault docs.
I'm going to close this issue as the Vault docs and learn tutorial will have matching permissions recommendations shortly. Thanks, everyone!
The Azure Secrets Engine API documentation (here) could include some additional information about the permissions needed to create Dynamic Secrets, AKA service principals. After hunting around, the required permissions are documented in Hashicorp Learn here.
It would be excellent to have the same steps on creating a service principal also in the documentation (just before writing to the azure/config), so that it becomes a quick and easy guide to setting up the Azure Secrets Engine.