hashicorp / terraform-provider-azurerm

Terraform provider for Azure Resource Manager
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Mozilla Public License 2.0

azurerm_storage_data_lake_gen2_filesystem: datalakestore.Client#GetProperties: Failure responding to request: StatusCode=403 #6659

Open shadowmint opened 4 years ago

shadowmint commented 4 years ago

Terraform (and AzureRM Provider) Version

$ terraform -v
Terraform v0.12.24
+ provider.azurerm v2.7.0
+ provider.random v2.2.1

Affected Resource(s)

azurerm_storage_data_lake_gen2_filesystem

Terraform Configuration Files

provider "azurerm" {
  version = "~> 2.7.0"
  features {}
}

provider "random" {
  version = "~> 2.2.0"
}

locals {
  resource_group_name = "rg-dev-test"
  storage_account_name = "devtest"
  location = "australiaeast"
}

resource "random_string" "unique_id" {
  length = 24 - length(local.storage_account_name)
  special = false
  upper = false
}

resource "azurerm_resource_group" "rg" {
  name = local.resource_group_name
  location = local.location
}

resource "azurerm_storage_account" "new_storage_account" {
  name = "${local.storage_account_name}${random_string.unique_id.result}"
  resource_group_name = azurerm_resource_group.rg.name
  location = local.location
  account_tier = "Standard"
  account_replication_type = "LRS"
  account_kind = "StorageV2"
  is_hns_enabled = true
  network_rules {
    default_action = "Allow"
  }
}

resource "azurerm_storage_data_lake_gen2_filesystem" "new_data_container" {
  name = "test-one"
  storage_account_id = azurerm_storage_account.new_storage_account.id
}

Debug Output

https://gist.github.com/shadowmint/3bc424a8fb2bba0415bd4ee67dfd8572

Panic Output

N/A

Expected Behavior

The Data Lake Gen2 filesystem should be created in the new storage account.

Actual Behavior

Error: Error checking for existence of existing File System "test-one" (Account "devtestp672h8fwgdvcjsv8i"): datalakestore.Client#GetProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: error response cannot be parsed: "" error: EOF

  on main.tf line 41, in resource "azurerm_storage_data_lake_gen2_filesystem" "new_data_container":
  41: resource "azurerm_storage_data_lake_gen2_filesystem" "new_data_container" {

Steps to Reproduce

  1. terraform apply

References

This seems similar to an issue with the azurerm_storage_account resource that caused the same sort of failure and was fixed in 2.1.0: https://github.com/terraform-providers/terraform-provider-azurerm/pull/6050

It seems plausible from the diff that the fix applied there was never applied to azurerm_storage_data_lake_gen2_filesystem, as the tests added in that PR only cover the blob container type.

dougxwok2 commented 4 years ago

You can work around this by explicitly assigning the Storage Blob Data Contributor role to the SP or user on the parent resource group, using azurerm_role_assignment.

However, it's not clear whether this is something wrong in the docs or an actual bug. It seems like a bug, because the storage account access token should have superuser permission to add containers even when this role is not assigned, and the same operation can be done via the portal / PowerShell.
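
A minimal sketch of that workaround (the principal lookup and resource names here are assumptions; scoping to the storage account itself also works):

data "azurerm_client_config" "current" {}

resource "azurerm_role_assignment" "blob_contributor" {
  # Grant the data-plane role on the parent resource group
  scope                = azurerm_resource_group.rg.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = data.azurerm_client_config.current.object_id
}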

njuCZ commented 4 years ago

@shadowmint your debug output file shows, at line 2121: "This request is not authorized to perform this operation using this permission." I tried your configuration locally and it succeeded without errors. Could you please double-check your permissions?

shadowmint commented 4 years ago

@njuCZ that is indeed the problem.

According to https://social.msdn.microsoft.com/Forums/en-US/7c8d58a4-04d1-4d0a-a9fa-c48c9991f1ab/azure-databricks-throwing-403-error?forum=AzureDatabricks

Only roles explicitly defined for data access permit a security principal to access blob or queue data. Roles such as Owner, Contributor, and Storage Account Contributor permit a security principal to manage a storage account, but do not provide access to the blob or queue data within that account.

...but we shouldn't be using the role to create the storage here, we should be using the access token.

So, yes: the issue is that the user isn't explicitly assigned the required Storage Blob Data Contributor role. But it isn't documented that this is a requirement (if it's even supposed to be one), and I'm fairly certain it shouldn't be required.

I don't know how your account can be working without assigning that role, but it definitely does not work on mine.

njuCZ commented 4 years ago

@shadowmint yes, it actually needs the Storage Blob Data Contributor role. The docs for this resource only say it requires the Storage-specific roles; I will submit a small PR to update the docs.

shadowmint commented 4 years ago

@njuCZ Are you sure this is the correct resolution?

The same operation (create container) can be performed via PowerShell without the role.

The example code in the docs doesn't do this. Please note, once again, the resolution in https://github.com/terraform-providers/terraform-provider-azurerm/pull/6050, which solved the almost identical problem for regular blob containers (https://github.com/terraform-providers/terraform-provider-azurerm/issues/6028 and https://github.com/terraform-providers/terraform-provider-azurerm/issues/5914).

I really think this is the provider doing the wrong thing: you shouldn't have to assign a role to the SP on the resource group to create a container. You don't have to do that for a regular storage container, specifically because of the resolution in https://github.com/terraform-providers/terraform-provider-azurerm/pull/6050

The error in https://github.com/terraform-providers/terraform-provider-azurerm/issues/6028, specifically, is identical to the one I've listed here:

Error: Error reading static website for AzureRM Storage Account "diaga581e35e2d11ddd2c63a": accounts.Client#GetServiceProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationPermissionMismatch" Message="This request is not authorized to perform this operation using this permission.\nRequestId:b5e2330d-c01e-000b-6cb6-f46bb7000000\nTime:2020-03-07T19:31:19.1613052Z"

  on main.tf line 36, in resource "azurerm_storage_account" "diagnostics":
  36: resource "azurerm_storage_account" "diagnostics" {

tombuildsstuff commented 4 years ago

@shadowmint

I don't know how your account can be working without assigning that role, but it definitely does not work on mine.

Unfortunately, the way the new Storage Roles have been rolled out is unusual.

The "Contributor" role definition has been updated to include the new Storage Roles (from memory Data Contributor, but I may be wrong). Whilst new role assignments using these (e.g. "Contributor") roles include the new Storage Roles - existing role assignments haven't been updated to include this on Azure's side. Ultimately what this means is that two users/service principals can have the "Contributor" role assigned with different permissions.

I believe that's the root cause here: the role assignment being used for the service principal (Service Administrator) doesn't include these new permissions, so you'd need to grant them explicitly. Alternatively, you can remove and re-add the service principal on the original role, which should also pick up this permission.

Unfortunately this is a behaviour of the Azure API, so there's not much we can do about it other than document the requirement (it's also why we don't use Azure AD for Storage authentication by default) - but hopefully that helps explain the behaviour.

shadowmint commented 4 years ago

@tombuildsstuff That's vexing, but at least it makes it a bit easier to understand what's happening.

it's also why we don't use Azure AD for Storage authentication by default...

This kind of cuts to the heart of the issue here.

storage_use_azuread defaults to false. Explicitly setting it to false at the provider level still produces the same error.

So why is this operation attempting to use AD auth here?

What you've said totally makes sense, and if it's just the way it is, fair enough... but my expectation would be that unless I explicitly chose to opt in with provider "azurerm" { ... storage_use_azuread = true }, this would be a non-issue for me?
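
For reference, the flag in question sits at the provider level; a minimal sketch of the explicit opt-out (which, per the above, still hits the error):

provider "azurerm" {
  features {}

  # Defaults to false; with false you would expect Shared Key (access key)
  # auth for storage data-plane calls rather than Azure AD
  storage_use_azuread = false
}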

shadowmint commented 4 years ago

This is still broken in 2.11.0

For anyone else who finds this, I recommend you forget about using azurerm_role_assignment, because as per https://github.com/terraform-providers/terraform-provider-azurerm/issues/6934 there is an arbitrary and indefinite delay between requesting the role and it actually being active.

Rather, just use the az CLI to do this via the access key, which is how it should be implemented.

ie. Create a module like this:

variables.tf:

variable "existing_storage_account" {
  description = "The storage account to deploy to"
}

variable "create_container_name" {
  description = "The name of the storage resource to create"
}

main.tf:

# This is a workaround for the azure provider not working
# https://github.com/terraform-providers/terraform-provider-azurerm/issues/6934
# https://github.com/terraform-providers/terraform-provider-azurerm/issues/6659

resource "null_resource" "storage-container" {
  triggers = {
    build_number = timestamp()
  }
  provisioner "local-exec" {
    command = "echo `pwd` && sh main.sh"
    working_dir = path.module
    environment = {
      STORAGE_ACCOUNT_KEY = var.existing_storage_account.primary_access_key
      STORAGE_ACCOUNT_NAME = var.existing_storage_account.name
      CONTAINER_NAME = var.create_container_name
    }

  }
  # NB: depends_on can't reference an input variable; the dependency on the
  # storage account is already implicit via the environment block above.
}

main.sh:

#!/bin/sh
# Note: Requires Azure CLI >= 2.6.0
# see: https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-cli

# Query
printf "Query container state..."
AZ_RESPONSE=$(az storage fs exists \
  -n "$CONTAINER_NAME" \
  --account-key "$STORAGE_ACCOUNT_KEY" \
  --account-name "$STORAGE_ACCOUNT_NAME" \
  --query "exists" 2>&1)
printf "\n\nResponse:\n\n%s\n\n" "$AZ_RESPONSE"

# Parse response
if [ "$AZ_RESPONSE" = "true" ]; then
  AZ_CONTAINER_EXISTS=1
else
  AZ_CONTAINER_EXISTS=0
fi
printf "Exists: %d\n" "$AZ_CONTAINER_EXISTS"

# Create if missing
if [ "$AZ_CONTAINER_EXISTS" -eq "0" ]; then
  printf "\nCreate container: %s" "$CONTAINER_NAME"
  AZ_RESPONSE=$(az storage fs create \
    -n "$CONTAINER_NAME" \
    --account-key "$STORAGE_ACCOUNT_KEY" \
    --account-name "$STORAGE_ACCOUNT_NAME" \
    --public-access off \
    2>&1)
  printf "\n\nResponse:\n\n%s\n\n" "$AZ_RESPONSE"
else 
  printf "\nContainer exists. No action."
fi

and then invoke it like this:

module "lake-storage-analytics" {
  source = "./modules/whatever"
  existing_resource_group = "my-rg-name..."
  existing_storage_account = ...
  create_container_name = "analytics"
}

This is obviously horrible, as the timestamp() trigger forces the null resource to re-run on every apply regardless of whether anything changed, but it appears there is no alternative approach that actually works.
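
If the rebuild-on-every-apply bothers you, a hedged variation is to key the triggers off the inputs instead of timestamp(), so the script only re-runs when they change (an untested sketch):

resource "null_resource" "storage-container" {
  # Re-runs only when the inputs change, rather than on every apply
  triggers = {
    container_name  = var.create_container_name
    storage_account = var.existing_storage_account.name
  }

  # ... same local-exec provisioner as above ...
}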

lazywinadmin commented 4 years ago

Thanks for sharing @shadowmint! FYI I still see the same issue in azurerm 2.18.

LaurentLesle commented 3 years ago

Just ran the code above successfully with Terraform 0.13.2 and the following change:

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.25.0"
    }
  }
  required_version = ">= 0.13"
}

tenderitaf commented 3 years ago

hi,

is it a Terraform issue or an azurerm provider issue, @LaurentLesle?

shadowmint commented 3 years ago

@LaurentLesle are you sure you don't simply have the storage role assigned to your account like njuCZ?

This still doesn't work, as I previously described, because it's not using the access token to create the storage container.

So... just to be absolutely clear: no, this is not resolved in the 2.25.0 provider.

SDubrulle-e61 commented 3 years ago

Are there any plans to update the internals of the provider to fix this behavior?

mattew commented 3 years ago

Are there any updates to this issue? I still have the problem, running the following:

Terraform v0.14.4
+ provider registry.terraform.io/hashicorp/azurerm v2.42.0
joe-plumb commented 3 years ago

I am also still experiencing this behaviour

Terraform v0.14.5
+ provider registry.terraform.io/hashicorp/azurerm v2.44.0
vikascnr commented 3 years ago

1st solution: you can explicitly add the IP of the machine executing this Terraform script to the storage account's "Firewalls and virtual networks" allow-list (a Terraform sketch of this follows below).

That machine can be your local machine or a DevOps self-hosted agent.

In my case, my self-hosted agent is part of the same virtual network that is allowed in "Firewalls and virtual networks", and it works perfectly.

2nd solution (for POC purposes): change your storage account settings:

  1. In Networking > Firewalls and virtual networks, allow access from all networks
  2. In Configuration, "Allow Blob public access" should be enabled

After this you can check access for the Service Principal.
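
In Terraform terms, the first workaround looks roughly like this (a sketch; resource names and the IP are placeholders):

resource "azurerm_storage_account_network_rules" "example" {
  resource_group_name  = azurerm_resource_group.rg.name
  storage_account_name = azurerm_storage_account.new_storage_account.name
  default_action       = "Deny"
  bypass               = ["AzureServices"]
  # Public IP of the machine or agent running terraform (placeholder value)
  ip_rules             = ["203.0.113.10"]
}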

@rlevchenko you are correct, I have updated my comment.

rlevchenko commented 3 years ago

@vikascnr it's not a solution. "Disable the firewall" is a kind of privilege for some.

The Azure team would need to fix "IP network rules have no effect on requests originating from the same Azure region as the storage account." in order to resolve the issue ('cause you get the same behavior with pipelines in Azure DevOps, for instance; I don't have any issues when the firewall is disabled, though).

shadowmint commented 3 years ago

@tsukabon, if you read the comment history you’ll see this:

For anyone else who finds this, I recommend you forget about using azurerm_role_assignment, because as per https://github.com/terraform-providers/terraform-provider-azurerm/issues/6934 there is an arbitrary and indefinite delay between requesting the role and it actually being active.

I’ll also point out, again, that this is a bug.

The Azure CLI uses the auth token, not AD, to perform this operation, which is why it works without assigning that role.

Using AD is an option specified at the root level of the provider, and should not be used by default.

If what you have described works for you, that's great! However, be aware that in general it will fail due to timing issues assigning roles.

cheers~

On Sun, 28 Mar 2021 at 11:06 pm, tsukabon wrote:

@shadowmint @joe-plumb @mattew

This problem can be solved by assigning the built-in role (Storage Blob Data Contributor). The following is a sample Terraform file.

data "azurerm_subscription" "primary" {}

resource "azurerm_role_assignment" "user" { scope = azurerm_storage_account.datalake.id role_definition_name = "Storage Blob Data Contributor" principal_id = data.azurerm_client_config.current.object_id }

resource "azurerm_storage_data_lake_gen2_filesystem" "example" { name = "dl2sample" storage_account_id = azurerm_storage_account.datalake.id depends_on = [azurerm_role_assignment.user] }

By the way, if you want to specify the role_definition_id instead:

resource "azurerm_role_assignment" "user" { scope = azurerm_storage_account.datalake.id role_definition_id = format("%s/providers/Microsoft.Authorization/roleDefinitions/%s", data.azurerm_subscription.primary.id, "ba92f5b4-2d11-453d-a403-e96b0029c9fe") principal_id = data.azurerm_client_config.current.object_id }

https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles

tsukashusan commented 3 years ago

@shadowmint Thank you for your message. I had not read the later comments in detail. I have deleted my comment: https://github.com/terraform-providers/terraform-provider-azurerm/issues/6659#issuecomment-634466197

robathija commented 3 years ago

@LaurentLesle are you sure you don't simply have the storage role assigned to your account like njuCZ?

This still doesn't work, as I previously described, because it's not using the access token to create the storage container.

So... just to be absolutely clear: no, this is not resolved in the 2.25.0 provider.

Any idea when this will be fixed? I still have the same issue, even with 2.59.0!

Elektry-On commented 3 years ago

Can confirm that this is still an issue with 2.60.0.

As a workaround I gave myself the "Storage Blob Data Contributor" role at subscription level. After a while, and a re-login with az logout and az login, it worked. Of course this is not the perfect solution.
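
If you do take the role-assignment route, a hedged sketch that also absorbs the propagation delay, using the hashicorp/time provider (all names are placeholders; the duration is a guess, and per the caveat in #6934 there is no guaranteed bound):

data "azurerm_client_config" "current" {}

resource "azurerm_role_assignment" "blob_contributor" {
  scope                = azurerm_storage_account.example.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = data.azurerm_client_config.current.object_id
}

# Arbitrary pause; RBAC propagation has no guaranteed upper bound
resource "time_sleep" "wait_for_rbac" {
  depends_on      = [azurerm_role_assignment.blob_contributor]
  create_duration = "300s"
}

resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id
  depends_on         = [time_sleep.wait_for_rbac]
}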

dylanberry commented 3 years ago

Use terraform they said, Azure RM is a first class citizen! First thing I try...great.

DesaCh01 commented 3 years ago

In my case, I was able to resolve this issue after adding the Terraform Enterprise subnet to the storage account network rules:

resource "azurerm_storage_account_network_rules" "sa" { resource_group_name = module.resource_group.name storage_account_name = azurerm_storage_account.sa.name default_action = "Deny" bypass = ["AzureServices"] virtual_network_subnet_ids = [module.virtual_network.subnet["tfe_public"].id] }

jvanenckevort commented 2 years ago

Any progress on resolving this issue?

keshamin commented 2 years ago

Got the same thing :(

shmyer commented 2 years ago

When allowing access to the storage account via our public proxy IP it succeeds, but when removing the public IP it fails. This leads me to think that the autorest client which is being used by azurerm_storage_data_lake_gen2_filesystem (and other resources!) is ignoring the NO_PROXY environment variable. Can anyone confirm this?

shmyer commented 2 years ago

I was able to resolve my issues by specifying more explicit values in the NO_PROXY environment variable: *.dfs.core.windows.net,*.file.core.windows.net,storageaccount0.blob.core.windows.net,storageaccount1.blob.core.windows.net
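
As a plain shell sketch (the storageaccount0/storageaccount1 names are placeholders for your own accounts; see the note below about wildcards):

# Explicit per-account entries plus leading wildcards for whole suffixes
export NO_PROXY="*.dfs.core.windows.net,*.file.core.windows.net,storageaccount0.blob.core.windows.net,storageaccount1.blob.core.windows.net"
terraform apply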

I could also confirm that wildcards in the middle of an entry do NOT work! Example: storageaccount0.*.core.windows.net. Note that you should not add *.blob.core.windows.net as an entry when using a storage account as Terraform backend that is not accessed via Private Link!

Hope this helps someone

asos-andreireznikau commented 2 years ago

Can confirm that this is still an issue with 2.96.0. Any progress on resolving it?

cody-carlson commented 2 years ago

We are also seeing this issue on the latest provider. Our scenario uses GitHub Actions on self-hosted runners, and we're hitting all sorts of snags while enabling private endpoints on any resources, not just storage accounts.

Basis1977 commented 2 years ago

Error: retrieving Path "xx" in File System "" in Storage Account "": datalakestore.Client#GetProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: error response cannot be parsed: "" error: EOF

  with module.data_lake.azurerm_storage_data_lakegen2,
  on ../../modules/azure/data_lake/main.tf line 99, in resource "azurerm_storage_data_lakegen2":
  resource "azurerm_storage_data_lake_gen2_path" "

AlexLudwigITDienstleistungen commented 2 years ago

Same problem here with the latest provider:

Error: checking if Blob "x" exists (Container "container" / Account "account" / Resource Group "rg"): blobs.Client#GetProperties: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: error response cannot be parsed: "" error: EOF

ken-tw commented 2 years ago

Hit this issue today. I have tried assigning Storage Blob Data Contributor to the SP but still get the error. I'm hoping there is just a delay before the role change takes effect.

The documentation says this...

(screenshot of the documentation listing the Storage roles required for data access)

But I do not know if this means I need to assign all these roles to my service principal? Some of them?

Very frustrating.

Prasanth-Sundarrajan commented 1 year ago

Same issue here..

gdubya commented 1 year ago

The solution will vary depending on your exact circumstances and configuration. In our case the error was caused by a Network Security Group on the subnet assigned to the private endpoint used by the storage account.

You can start by debugging connectivity to the endpoint; for example, try to curl the URL using both http and https.
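
For example (a sketch with a hypothetical account name):

# Check the dfs endpoint over both schemes; proxy, DNS and NSG problems
# surface here before terraform gets involved
curl -v https://devtest.dfs.core.windows.net/
curl -v http://devtest.dfs.core.windows.net/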

On Tue, 26 Jul 2022, 18:02 Prasanth-Sundarrajan wrote:

Same issue here.. why is the Microsoft team not able to give us a correct solution for this?

flavio-neves commented 1 year ago

For anyone coming from search engines: this "datalakestore.Client#GetProperties" error with azurerm_storage_data_lake_gen2_filesystem happens when you have the firewall enabled on the storage account.

Just add the resolving IP of the machine running Terraform to the exclusion list, or the subnet if inside the Azure environment.
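
A hedged CLI sketch of that (resource names are placeholders):

# Allow the public IP of the machine running terraform through the account firewall
MY_IP=$(curl -s https://ifconfig.me)
az storage account network-rule add \
  --resource-group rg-dev-test \
  --account-name devtest \
  --ip-address "$MY_IP"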

aravindjal commented 1 year ago

For anyone coming from search engines: this "datalakestore.Client#GetProperties" error with azurerm_storage_data_lake_gen2_filesystem happens when you have the firewall enabled on the storage account.

Just add the resolving IP of the machine running Terraform to the exclusion list, or the subnet if inside the Azure environment.

In my case it's been enabled for all networks; even then I'm getting the same issue.

dre2004 commented 1 year ago

Is there a fix for this? Enabling public access to be able to create a container isn't really ideal.

davidhuser commented 1 year ago

The workaround in this StackOverflow post, which assigns the Storage Blob Data Owner role, helped me – although I had to add the dependency with depends_on on all other resources referencing the azurerm_storage_account resource.
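
A sketch of that variant (all names are placeholders; the role and the explicit dependency are the point):

data "azurerm_client_config" "current" {}

resource "azurerm_role_assignment" "blob_owner" {
  scope                = azurerm_storage_account.example.id
  role_definition_name = "Storage Blob Data Owner"
  principal_id         = data.azurerm_client_config.current.object_id
}

# Each resource that touches the account's data plane gets the explicit dependency
resource "azurerm_storage_data_lake_gen2_filesystem" "example" {
  name               = "example"
  storage_account_id = azurerm_storage_account.example.id
  depends_on         = [azurerm_role_assignment.blob_owner]
}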

ashishp-blueshift commented 1 year ago

In case this helps anyone else out there, I ran into this exact problem. Long story short, my lake was only exposed to internal Azure subnets. I just needed to add my local IP to the firewall list and presto, the container creation worked.

I would suggest there needs to be a better error message rather than the current 403; that's just me.

pavantejach7 commented 7 months ago

This is still an issue. I tried all the above options; in my case the storage account is private, so I cannot enable network rules.

patrcoff-kainos commented 7 months ago

Exactly the same issue as @pavantejach7 - what are you supposed to do when you're not allowed to have a public endpoint, even one restricted to just one IP?

rvdouderaa commented 7 months ago

We are seeing the same. After adding the IP of the deployment agent, it still isn't working. If the firewall is set to allow access from all networks, it works fine. It looks like that is only necessary during the state refresh (we manually set the firewall to allow all, so during the deployment it is set to specific networks, and it still keeps working).

komglebissarov commented 5 months ago

Hello.

I had the same problem. For me it worked after adding a dedicated dfs private endpoint instead of making the connection public. Running Terraform as a global admin, I did not get any problems with permissions. Hope my lazy copypasta helps someone:

module "storage-account" {
  source = "../storage-account"

  name = "meme"

  environment         = var.environment
  location            = var.location
  resource_group_name = var.resource_group_name

  endpoints = ["blob", "dfs"]

  account_kind             = "StorageV2"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  privatelink_subnet_id = var.privatelink_subnet_id
  #public_network_access_enabled = true

  is_hns_enabled = true

  providers = {
    azurerm.infra = azurerm.infra
  }

  tags = {} #var.tags
}

resource "azurerm_storage_data_lake_gen2_filesystem" "main" {
  name               = "meme"
  storage_account_id = module.storage-account.id
}

https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns
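
For completeness, the dfs private endpoint piece might look roughly like this (a sketch assuming the module exposes the account id; all names are placeholders):

resource "azurerm_private_endpoint" "dfs" {
  name                = "pe-dfs-meme"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.privatelink_subnet_id

  private_service_connection {
    name                           = "psc-dfs-meme"
    private_connection_resource_id = module.storage-account.id
    is_manual_connection           = false
    # "dfs" targets the Data Lake Gen2 endpoint specifically
    subresource_names              = ["dfs"]
  }
}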