Azure / azure-functions-host

The host/runtime that powers Azure Functions
https://functions.azure.com
MIT License

Storage Binding using Managed Identity instead of ConnectionString? #6423

Closed dedreira closed 2 years ago

dedreira commented 4 years ago

Is your question related to a specific version? If so, please specify:

What language does your question apply to? (e.g. C#, JavaScript, Java, All)

All

Question

Hi! Is it possible to use a blob storage binding against a storage account using the Managed Service Identity of the Azure Function? I've checked the documentation, and the only examples define the binding using the Storage Account connection string.

Thanks

madushans commented 4 years ago

I'm not sure this is supported for Storage Accounts in general. Your function app does get a Managed Service Identity, but I don't think Storage Accounts know how to accept and verify connections based on it.

Usually, resources that support this have a Settings > Access Policies blade in the portal that lets you configure which MSI is allowed to do what. For example, Key Vault resources have this, but storage accounts don't.

Alternatively you can certainly create a connection string based on a SAS that would expire at a particular time or only has certain access rights. If you use the secondary key for this SAS, rotating that key can invalidate all SAS tokens signed by that key. I understand this is not the same as MSI, but this is the closest I can think of.
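A minimal sketch of the SAS-based alternative described above, assuming the SAS token has already been generated elsewhere (for example in the portal). The helper name, account name, and token are hypothetical placeholders; the `BlobEndpoint=...;SharedAccessSignature=...` layout is the documented SAS connection string form for a blob endpoint.

```python
def sas_connection_string(account_name: str, sas_token: str) -> str:
    """Build a blob-only connection string from an existing SAS token.

    Hypothetical helper for illustration; it does not create or sign the SAS.
    """
    # Tokens copied from the portal often start with '?'; drop it.
    token = sas_token.lstrip("?")
    endpoint = f"https://{account_name}.blob.core.windows.net"
    return f"BlobEndpoint={endpoint};SharedAccessSignature={token}"


if __name__ == "__main__":
    # Placeholder values, not a real account or signature.
    print(sas_connection_string("mystorageaccount", "?sv=PLACEHOLDER&sig=PLACEHOLDER"))
```

The resulting string would be stored in an app setting and referenced by the binding, so it is still a secret that has to be managed and rotated.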

Keen to know if the above is wrong.

dedreira commented 4 years ago

@madushans it is supported for Storage Accounts, I have Managed Identity of my Function App activated, and I gave it "Storage Blob Data Contributor" role in the Storage Account.

You can configure RBAC for your Storage Account following this article if you want: https://docs.microsoft.com/en-us/azure/storage/common/storage-auth-aad-rbac-portal

I don't want to use a connection string with a SAS, I want to be passwordless if possible, as it is easier to maintain and I can benefit from RBAC capabilities.

madushans commented 4 years ago

@dedreira Sorry, I didn't know about that.

I don't think the Storage bindings in Functions natively support it.

For example, the Key Vault SDK allows the connection string to be RunAs=App; to use this feature, but Storage doesn't.

dedreira commented 4 years ago

@madushans thanks a lot for your answer...unfortunately this is not what I was looking for.

As I wrote when I opened the Issue/Question, I was trying to use a "Storage Binding" against a Storage Account using a Managed Identity instead of a Connection String.

As you probably know, Azure Function Bindings provide a way of connecting with other Azure resources without the need of writing the high amount of code needed in other scenarios (App Service, for example).

https://docs.microsoft.com/en-us/azure/azure-functions/functions-triggers-bindings

So, I'm going to explain what I am trying to do, to clarify the purpose of this Issue/Question:

I'm writing a Function in Python that will "listen" to a container in a Storage Account using a Storage Binding. When a new file is added to this container, I want the Function to fire automatically.

The Azure Function Bindings documentation says that to configure the Storage trigger you need to specify several parameters, and one of them is the connection string of the Storage Account:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-trigger?tabs=csharp#configuration

(Please don't mind that I have posted the C# version of the docs; the Python version is similar.)
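For illustration, here is a rough sketch of the blob trigger configuration being discussed, generated as `function.json` content for a Python function. The helper name, the parameter name `myblob`, and the setting name used in the example are hypothetical; the point is that `connection` holds the name of an app setting, and at the time of this question that setting had to contain a connection string.

```python
import json


def blob_trigger_function_json(path: str, connection_setting: str) -> str:
    """Return function.json content for a blob-triggered Python function."""
    config = {
        "scriptFile": "__init__.py",
        "bindings": [
            {
                "name": "myblob",          # hypothetical parameter name in main()
                "type": "blobTrigger",
                "direction": "in",
                "path": path,              # for example "<container>/{name}"
                # Name of an app setting, not the secret value itself.
                "connection": connection_setting,
            }
        ],
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(blob_trigger_function_json("input-container/{name}", "MyStorageConnection"))
```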

The point here is that I want to use the Managed Identity of the Function to configure the trigger and connect with the Storage Account, and get rid of the Storage Account connection string.

I'm using Azure DevOps Pipelines CI/CD to package and deploy the function into several environments, and I'm not comfortable dealing with connection strings in my pipelines (nor with saving them in the App Settings of the Function), because I don't want to depend on or expose the Storage Access Keys. So I think it would be great to be able to use an MSI to configure the Storage Binding.

As you said in your comment, the Key Vault SDK supports this feature and, from what I've read, so does the Event Hubs SDK. Sadly, it is not available for the Storage Account SDK.

I would love it if someone from the Azure Functions team or the Storage Account SDK team could read this Issue and tell us their thoughts about this feature. 😄

brettsam commented 4 years ago

I think either @jeffhollan or @mattchenderson have discussed using Managed Identities with different bindings. I believe we first need support from the underlying SDK -- but maybe it's supported in newer versions of the Storage SDK? https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/identity/Azure.Identity#authenticating-a-service-principal-with-a-client-secret...

dedreira commented 4 years ago

@brettsam I know that it's possible to connect to a Storage Account from the Azure.Storage SDK using Managed Identities in C# and in Python (I've done it before with C# and now with Python 😄), but the point here is that the storage binding only accepts a storage connection string. It would be great to be able to choose the connection type.

@jeffhollan , @mattchenderson , @brettsam what do you think?

mattchenderson commented 4 years ago

Agreed that this is a goal, and we're in the process of working with the Azure SDK folks around this. I don't have an ETA yet.

brettsam commented 4 years ago

Moved this to backlog until we have a design and plan. Then we'll move it to a proper milestone.

pinkfloydx33 commented 3 years ago

Running Azure Functions in Docker containers inside Kubernetes with Pod Identity (managed identity) is one place where this would be helpful. Right now I can configure the KEDA autoscaler to use pod identity, but I still have to manage the connection string for the binding itself, which is quite unfortunate. It would definitely be nice for queue storage bindings to support managed identity so that all we need is the account name and queue.

Styxxy commented 3 years ago

Any progress / indication when this might be available?

JohanKlijn commented 3 years ago

It would indeed be a nice feature to have, because currently all my function code uses managed identity except the bindings, which is a little unfortunate. It would be nice if you could just specify https://mystorageacount.blob.core.windows.net/ as the connection string and Azure Functions would create a client using managed identity.

An "in-between" solution would be for the binding to read the connection string not only from the configuration but also from a configured Key Vault. That way the connection string is stored more securely and you have a central place for managing connection strings.

cyberpion-yotam commented 3 years ago

I would also greatly appreciate this (in all bindings)! @JohanKlijn This is already possible in app settings - https://docs.microsoft.com/en-us/azure/app-service/app-service-key-vault-references
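As a small sketch of what such a Key Vault reference looks like, the helper below builds the app setting value in the `@Microsoft.KeyVault(SecretUri=...)` form described in the linked documentation. The helper name, vault name, secret name, and setting name are hypothetical.

```python
def key_vault_reference(secret_uri: str) -> str:
    """Format an app setting value that points at a Key Vault secret."""
    return f"@Microsoft.KeyVault(SecretUri={secret_uri})"


if __name__ == "__main__":
    # Hypothetical vault and secret; the binding keeps using the setting name.
    app_settings = {
        "MyStorageConnection": key_vault_reference(
            "https://myvault.vault.azure.net/secrets/storage-connection/"
        )
    }
    print(app_settings)
```

The binding still receives a connection string at runtime; the reference only moves where the secret is stored.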

chrismcclure commented 3 years ago

Checking in to see if any progress has been made on this one. It would be nice to be able to use Managed Identity for the connection property of the function queue trigger. Right now, I just pull it from the config.

JohanKlijn commented 3 years ago

I didn't know the vault-references, so thanks for the info!

adriennn commented 3 years ago

any update on this @brettsam ?

brettsam commented 3 years ago

This work has been moved to the Azure SDK, which has released 5.0.0-beta.1 that looks to have this functionality. You can see those release notes here: https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/storage/Microsoft.Azure.WebJobs.Extensions.Storage.Blobs/CHANGELOG.md. There it says:

Added support for token credential authentication using Azure.Identity library, including support for managed identity and client secret credentials.

But I'm not sure whether that's supported in Functions. @paulbatum or @kasobol-msft should have more details.

sijucm commented 3 years ago

The documentation given below says it is possible: https://docs.microsoft.com/en-us/samples/azure-samples/functions-storage-managed-identity/using-managed-identity-between-azure-functions-and-azure-storage/

But knowing the inconsistencies in the Azure libraries I won't be surprised that the above documentation does not mean it is available in the Azure functions with NodeJS. (and it might even differ for the Windows or Linux servers and for any random reasons. It is all a rabbit hole with dead ends. Just use the connection string and live with it if it is not possible to move to another cloud)

paulbatum commented 3 years ago

See https://github.com/Azure/azure-webjobs-sdk/issues/2575 for related conversation.

We have work currently in progress to make sure managed identity works end-to-end for this scenario (and other scenarios, such as service bus, eventhubs, etc) in Azure Functions. It will work for all languages. These changes rely on the work that Brett highlighted above, but there is more to do. Expect some more news in this area within the next few months.

michc-msft commented 3 years ago

I'm also interested in this scenario. We have security guidance to auto-rotate keys for storage accounts, but that's not a super feasible option given you would either have to manually go in and rotate the connection string in each Azure Function you own or write a separate tool to automate this. Using Managed Identity would resolve this issue for us.

@paulbatum I didn't see any updates on the thread you posted, is there any knowledge of when this feature might be on the roadmap?

paulbatum commented 3 years ago

Yep the work is in progress and should be available for use in a couple of months (hopefully we'll be able to communicate more precisely about the expected GA date once we have a preview of the full end-to-end available).

michc-msft commented 3 years ago

@paulbatum do you also happen to know if Durable Function TaskHub bindings will be included in the shift towards Managed Identity?

paulbatum commented 3 years ago

@michc-msft I believe this is a separate work item that the durable team has on their radar. Not sure if they have a tracking issue for it in github though. I took a quick look here and couldn't find one.

michc-msft commented 3 years ago

Just opened a feature request. Thanks for the pointer @paulbatum!

amih90 commented 3 years ago

Any updates?

mattias-fjellstrom commented 3 years ago

@amih90 check this out https://devblogs.microsoft.com/azure-sdk/introducing-the-new-azure-function-extension-libraries-beta/ which seems to be related to this feature.

weerakoons commented 3 years ago

Yes, there is a way to access Storage without a connection string. Please follow these steps:

1) Edit your Python code with the following changes.

Import the Azure Identity library and the blob client:

  from azure.identity import DefaultAzureCredential
  from azure.storage.blob import BlobServiceClient

  # Then, in your code, create the client with the credential
  default_credential = DefaultAzureCredential()

  blob_service_client = BlobServiceClient(account_url="https://<Your Account>.blob.core.windows.net", credential=default_credential)
  # List the containers in the account
  container_list = blob_service_client.list_containers()
  # With this blob_service_client you can now do whatever you need.

2) Configure the Managed Identity in the portal.

paulbatum commented 3 years ago

Yesterday we announced the public preview of secretless functions scenarios: https://azure.microsoft.com/en-us/updates/public-preview-identitybased-connections-in-azure-functions-with-latest-azure-sdk-triggers-and-bindings/

This builds on the work that was linked above, where a number of functions extensions have been updated to follow the new Azure SDK guidelines, which includes support for managed identity.

The main documentation for using identity based connections is here: https://docs.microsoft.com/en-us/azure/azure-functions/functions-reference#configure-an-identity-based-connection

Please try this out. You should be able to successfully deploy a function app that scales dynamically (in the consumption or premium plans) that uses identity based connections for your azure storage based triggers (and also the other trigger types mentioned in the documentation), and similarly uses an identity based connection for AzureWebJobsStorage. You will need to keep using an access key based connection string for WEBSITE_CONTENTAZUREFILECONNECTIONSTRING because Azure Files does not currently support managed identity for SMB file share mounts. We recommend you use a key vault reference for this setting.
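A minimal sketch of the app settings involved, assuming the double-underscore setting names that appear later in this thread (`AzureWebJobsStorage__accountName` and `<prefix>__serviceUri`). The helper name and the binding prefix are hypothetical, and the exact property names should be checked against the linked documentation for the extension version in use.

```python
def identity_based_settings(account_name: str, binding_prefix: str) -> dict:
    """App settings that point at a storage account without any secret."""
    blob_uri = f"https://{account_name}.blob.core.windows.net"
    return {
        # Host storage: replaces the AzureWebJobsStorage connection string.
        "AzureWebJobsStorage__accountName": account_name,
        # Trigger/binding connection: the binding's connection property
        # would be set to binding_prefix (hypothetical name).
        f"{binding_prefix}__serviceUri": blob_uri,
    }


if __name__ == "__main__":
    for name, value in identity_based_settings("mystorageaccount", "MyStorage").items():
        print(f"{name}={value}")
```

The identity of the function app also needs a suitable role on the storage account, such as the "Storage Blob Data Contributor" role mentioned earlier in the thread.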

jesperkristensen commented 3 years ago

Do you have any idea if WEBSITE_CONTENTAZUREFILECONNECTIONSTRING will support managed identity in the near future? E.g. if Azure Files will support it or if Azure Functions will move to a storage technology that supports it? It does not improve things much that we now only have one copy of the shared secret instead of two.

Key vault references do not work for us. Key vault references only allow system-assigned identities, but we only use user-assigned identities. Key vault references cache the secrets for up to one day, but when we regenerate keys, we regenerate the two keys less than half an hour apart from each other.

paulbatum commented 3 years ago

@jesperkristensen I'm not aware of any plan to enable identity based connections for SMB mounts in Azure Files (though I am not part of that team). However, we are working on some changes and instructions for how to create a function app that doesn't need WEBSITE_CONTENTAZUREFILECONNECTIONSTRING. This would rely on using the run-from-package feature, where it points at a package hosted in blob storage (which can be authenticated with an identity based connection), and foregoing a few small features (for example, no filesystem based logging). I expect this documentation to be published in the next 3-4 weeks; I will post a link here when it's available.
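A rough sketch of the idea, assuming the run-from-package feature is configured through the `WEBSITE_RUN_FROM_PACKAGE` app setting and that the settings are being prepared for a new app (for example in a deployment template). The helper name and package URL are hypothetical.

```python
AZURE_FILES_SETTINGS = (
    "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING",
    "WEBSITE_CONTENTSHARE",
)


def settings_without_azure_files(base_settings: dict, package_url: str) -> dict:
    """Drop the Azure Files settings and point the app at a package blob."""
    result = {
        name: value
        for name, value in base_settings.items()
        if name not in AZURE_FILES_SETTINGS
    }
    # The package is read from blob storage instead of an SMB file share.
    result["WEBSITE_RUN_FROM_PACKAGE"] = package_url
    return result


if __name__ == "__main__":
    base = {"FUNCTIONS_WORKER_RUNTIME": "python", "WEBSITE_CONTENTSHARE": "myshare"}
    url = "https://mystorageaccount.blob.core.windows.net/packages/app.zip"
    print(settings_without_azure_files(base, url))
```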

eLVas commented 3 years ago

@paulbatum

Is the public preview of identity based connections also available for functions written in Python? If so, is there an example of how to set it up?

paulbatum commented 3 years ago

It is technically possible but more complicated to set up, because it requires not using bundles and instead manually referencing the new beta extensions (you would use these instructions). We are working on an updated 3.0-preview bundle release that would make it much easier to consume this feature from Python (and Java/JavaScript/etc). Let me try to find out how far away that is from being released.

paulbatum commented 3 years ago

An update on the last point discussed - it looks like we should have an updated extension bundle preview (containing the new libraries) published in the next two weeks.

jesperkristensen commented 3 years ago

I managed to delete the WEBSITE_CONTENTAZUREFILECONNECTIONSTRING from a function. It turns out that I also had to delete WEBSITE_CONTENTSHARE at the same time, or else I get an error message when saving the app settings saying that WEBSITE_CONTENTAZUREFILECONNECTIONSTRING is required (but the error message does not say anything about WEBSITE_CONTENTSHARE).
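The pairing described above can be checked before saving app settings. This is a small sketch of such a check; the function name and return values are hypothetical and it only mirrors the behavior reported in this comment.

```python
AZURE_FILES_SETTINGS = (
    "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING",
    "WEBSITE_CONTENTSHARE",
)


def azure_files_settings_state(settings: dict) -> str:
    """Classify a settings dict as 'none', 'both', or 'partial'."""
    present = [name for name in AZURE_FILES_SETTINGS if name in settings]
    if not present:
        return "none"      # app configured without Azure Files
    if len(present) == len(AZURE_FILES_SETTINGS):
        return "both"      # regular Azure Files configuration
    return "partial"       # only one of the pair is set; saving was reported to fail


if __name__ == "__main__":
    print(azure_files_settings_state({"WEBSITE_CONTENTSHARE": "myshare"}))
```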

When I delete the above two settings and change from AzureWebJobsStorage to AzureWebJobsStorage__accountName, my function can run a TimerTrigger without any shared keys. However, if I have a ServiceBusTrigger in addition to the TimerTrigger, the ServiceBusTrigger will not execute. I cannot find any error messages anywhere; the trigger just does not do anything. I am using Microsoft.Azure.WebJobs.Extensions.ServiceBus version 4.3.0. I am still using a connection string with a shared key for the service bus, because I am not allowed to use beta or preview NuGet packages. I am only using the managed identity for the AzureWebJobsStorage storage account. Any idea why this does not work?

paulbatum commented 3 years ago

I would suggest waiting for some documentation that explains how to create a new function app without WEBSITE_CONTENTAZUREFILECONNECTIONSTRING instead of modifying an existing app. I am not sure that modifying an existing app will work right.

jesperkristensen commented 3 years ago

Ok, I just saw that https://docs.microsoft.com/en-us/azure/azure-functions/storage-considerations#create-an-app-without-azure-files was recently added and thought it was ready for use. I will wait until I hear more.

paulbatum commented 3 years ago

@jesperkristensen Ahh thank you for the pointer, I didn't realize that content was completed and published. You should go ahead and try to get this working.

I ran through this myself today and I was able to get it working OK. But I did it by creating a new function app with no azure files settings. I did this with a modified ARM template.

SimonWahlin commented 3 years ago

This looks very promising!

It does say "You must deploy from an external package URL". Does this mean it will pull from this URL on each cold start? Or does it have some hidden storage?

paulbatum commented 3 years ago

We have implemented some optimizations that allow us to skip pulling the full blob on every cold start. If you monitor your storage account activity you will see that the blob is read from time-to-time, so we still strongly recommend you make sure that the blob account hosting the package is in the same region as the function app. But yes, we are able to do faster cold starts by not pulling the package every time.

paulbatum commented 3 years ago

Hi folks, I did my best to document the steps I followed to get this scenario fully working. Hopefully this helps!

Creating a secretless Azure Function using managed identity

@jesperkristensen I think I might know what's missing in your scenario. For each connection that is used for a trigger, you need to specify that the credential type is managed identity. So here is an example of doing this for a service bus connection that is used for a queue trigger (from the walkthrough I linked above):

[image: example settings for a service bus connection that uses managed identity]
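A sketch of what such settings might look like for a Service Bus connection. It assumes the `<prefix>__fullyQualifiedNamespace` and `<prefix>__credential` setting names from the identity-based connection documentation; the helper name, prefix, and namespace are hypothetical, and the names should be verified against the walkthrough and the current docs.

```python
def service_bus_identity_settings(prefix: str, namespace: str) -> dict:
    """App settings for a Service Bus trigger connection using managed identity."""
    return {
        # Which namespace to connect to, instead of a connection string.
        f"{prefix}__fullyQualifiedNamespace": f"{namespace}.servicebus.windows.net",
        # Declares that the connection's credential type is managed identity.
        f"{prefix}__credential": "managedidentity",
    }


if __name__ == "__main__":
    for name, value in service_bus_identity_settings("ServiceBusConnection", "mynamespace").items():
        print(f"{name}={value}")
```

The trigger's connection property would then be set to the prefix (here the hypothetical `ServiceBusConnection`).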

jesperkristensen commented 3 years ago

@paulbatum Deleting the function app before recreating it without WEBSITE_CONTENTAZUREFILECONNECTIONSTRING fixed the ServiceBusTrigger as you suggested.

However I also have a BlobTrigger, which fails. It logs this to Application Insights: The '"BlobTrigger"' function is in error: "Microsoft.Azure.WebJobs.Host: Error indexing method 'BlobTrigger'. Microsoft.Azure.WebJobs.Extensions.Storage: Storage account connection string 'AzureWebJobsStorage' does not exist. Make sure that it is a defined App Setting."

However the trigger is defined as [BlobTrigger("test-container/{name}", Connection = "StorageAccountConnectionString")] so I don't understand why it tries to access AzureWebJobsStorage. (StorageAccountConnectionString is still a connection string. It is only AzureWebJobsStorage that I tried to change to managed identity, since the docs say only managed identity for AzureWebJobsStorage is out of preview, but for BlobTrigger it is still only in preview)

paulbatum commented 3 years ago

@jesperkristensen Thanks for the info, I was able to repro, we're taking a look.

JonNieminen commented 3 years ago

We're struggling to authenticate to a blob storage container using the Azure Function's MSI in Python code. The documentation is a bit confusing regarding the explicit extension installation: how does one make the storage SDK 5.x available in the function app and not only locally?

paulbatum commented 3 years ago

@jesperkristensen Sorry it took a bit of time to figure out - unfortunately switching the host over to use managed identity for AzureWebJobsStorage does not work if you continue to use the older storage extension (4.x). The reason for this is that the blob trigger uses a queue that is created in the AzureWebJobsStorage account, and that code doesn't know how to handle the identity based connection (because all that handling is in the 5.x extension). My recommendation is you plan on doing the full upgrade - wait until Microsoft.Azure.WebJobs.Extensions.Storage 5.x goes into general availability, and then update to it and switch everything over to use identity based connections.

paulbatum commented 3 years ago

Hi folks - we've published a preview extension bundle that you can use to try identity based connections in JavaScript, Java, PowerShell and Python without having to fiddle with explicit extension installation.

To use it, you'll need to modify your host.json:

  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle.Preview",
    "version": "[3.*, 4.0.0)"
  }

This should work both locally and when deployed to Azure. Be aware that this extension bundle uses new major versions of the Azure SDKs for storage, eventhubs, servicebus and eventgrid, so it's possible you'll encounter errors or behavior differences that are due to breaking changes in those SDKs. If you're going to try this out, I recommend you do it first with a very simple application to confirm you got the steps right for using the bundle and setting up the identity based connections.
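For anyone scripting this change, here is a minimal sketch that applies the bundle settings above to existing host.json content. The function name is hypothetical; the bundle id and version range are the ones given in this comment.

```python
import json

PREVIEW_BUNDLE = {
    "id": "Microsoft.Azure.Functions.ExtensionBundle.Preview",
    "version": "[3.*, 4.0.0)",
}


def use_preview_bundle(host_json_text: str) -> str:
    """Return host.json content with extensionBundle set to the preview bundle."""
    host = json.loads(host_json_text)
    # Replaces any existing extensionBundle entry; other keys are left as-is.
    host["extensionBundle"] = dict(PREVIEW_BUNDLE)
    return json.dumps(host, indent=2)


if __name__ == "__main__":
    print(use_preview_bundle('{"version": "2.0"}'))
```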

cc @eLVas @JonNieminen as you both asked about this specifically

eLVas commented 3 years ago

@paulbatum Thank you! I will try it out.

JonNieminen commented 3 years ago

Good stuff, tested it this morning and it works for our quite simple task to read a blob in and output it out!

jesperkristensen commented 3 years ago

Thanks for the update.

jfrosty commented 3 years ago

Can you please confirm whether this should work for deployments from Visual Studio Code? During deployment I'm getting the error "Malformed SCM_RUN_FROM_PACKAGE when uploading built content".

I'm trying to deploy to a consumption Python 3.8 Function App. I updated host.json for ExtensionBundle.Preview, and I assigned Storage Blob Data Owner to the Function App's MI. I tried using AzureWebJobsStorage__accountName and AzureWebJobsStorage__serviceUri in the Function App's application settings, but neither worked.

paulbatum commented 3 years ago

I can confirm this worked from VS code for me - I used VS code to deploy the JavaScript function app that was using the preview bundle. However, I was deploying the content to an app running on Windows. I will try to do the same for a Python app running on Linux and see if I can reproduce your error.

paulbatum commented 3 years ago

@jfrosty I was able to reproduce the Malformed SCM_RUN_FROM_PACKAGE when uploading built content. We'll investigate.