Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License
4.38k stars 2.71k forks source link

AzureAppConfigurationProvider should be able to lazy load secrets #36205

Open macieyng opened 3 weeks ago

macieyng commented 3 weeks ago

Is your feature request related to a problem? Please describe. Our application does use Key Vault and App Configuration. Some values in App Configuration are a reference to a Key Vault. When we initiate AzureAppConfigurationProvider with a load function then all of the values that are references to a Key Vault needs to go through SecretClient.get_secret. Problem is that in our use case we have over 70 values referenced and fetching all of them one by one at each refresh takes around 2 to 3 seconds and it's impacting performance in our app (see the screenshot below and everything will be clear). Or to be honest the whole microservice system, because services have dependencies on each other and are scaled, so even if we send request to the same service twice in the worst case we will loose 6 seconds on loading secrets that are not even required for specific API handler. Some of them will be required only for the application startup, so refreshing them even though they're not used doesn't make sense too.

Describe the solution you'd like Ideally secrets should be lazy loaded, when they're required or fetch in a batch that contains a set of values that are required and not all of the values that are in the Key Vault.

My solution can be found here: #36204

Describe alternatives you've considered Fetching all of the secrets from a Key Vault at once with SecretClient.get_secrets

Additional context

Screenshot 2024-06-21 at 22 45 11
github-actions[bot] commented 3 weeks ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

mrm9084 commented 3 weeks ago

@macieyng, I'll start taking a look at this. Just a clarification, do you actually need to load all of these secrets or are you just needing a sub set of them for a given provider. i.e. If you used a SettingSelector could you solve some of this by loading only the secrets you need?

macieyng commented 2 weeks ago

We use a single provider for the whole app - if that's an anti-pattern, then let us know and this should be added to docs somehow.

But on the other hand I can't imagine a system/pattern that we could group/split/partition our secrets, so multiple providers would make sense. We have things like db connection secrets, connection strings to cosmos, servicebus, API keys to external services and each of our microservices require some subset of those. The only way we could do it with current implementation is to use very specific selectors. But then we need to keep track of them, easy to forget to add one, etc. And new SDK users will need to learn about selectors early on, while you can give them a chance to not bother with learning about selectors and offer lazy loading out of the box.

From my experience selectors are fine for non-secret values, because it's easy to give them a structured key (<service>:<setting-group>:<specific-setting>), while for dbs and external apis there will likely be a single level of a structure (db:<database-name>, but we have a single database, api:<external-vendor-name>, but we have tens of vendors).

That's why I think lazy loading is the best choice for secrets - even from the security point of view: we respect principle of least privilege - because we only get those key that we actually use.

macieyng commented 2 weeks ago

I thought about a bit more. I think that depending on interpretation we might be breaking principle of least privilege. Therefore using selectors for each secret make sense.

Nevertheless, I do think that beside selectors lazy loading could be something useful and have positive impact on performance. I think it should be a default behaviour with an opt-out.

mrm9084 commented 2 weeks ago

@macieyng You might want to look into using labels for your secrets instead of doing everything by the name of the key. For example if you have cosmos secrets you can use cosoms or similar as the label. So, then it's clear what secrets you are retrieving.

This can also be helpful in your dev environments as you will not want to access prod resources but access similar items.

I agree that this is still a useful thing to look into adding.

miqm commented 2 weeks ago

I also agree that reducing amount of data fetched could be handled with a proper key organization and multiple selectors. However the usual pattern is that you have a central configuration store. With selector-per-usage approach you would need to maintain multiple instances of configuration provider which can not be easy. In general, the whole point of using various SDKs is to reduce complexity of your code and focus on the business value rather than reinventing the wheel again and again.

KeyVault operation is not only long but also incurs costs. with keyvault lazy loading this could be beneficial, especially that there's no sentinel key to watch for.

Additionally, for key references with secret id the entire process of refreshing could be skipped (not sure if it is now) as the value cannot change.