databricks / databricks-sdk-go

Databricks SDK for Go
https://docs.databricks.com/dev-tools/sdk-go.html
Apache License 2.0

[FEATURE] Support for Azure AD Workload Identities #566

Open agchang opened 1 year ago

agchang commented 1 year ago

Problem Statement
As far as I can tell, databricks-sdk-go supports managed identities, but not Azure AD Workload Identities.

Proposed Solution
Perhaps we could integrate with https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity

Additional Context
We are trying to deploy an Azure Databricks instance with a workload identity. This is supposed to supersede what was previously known as "pod identity", and it allows more fine-grained associations of managed identities to K8S workloads. The process of obtaining a token differs from that of regular managed identities: there is no communication with the Azure Instance Metadata Service (IMDS), which is what databricks-sdk-go relies on when AzureUseMsi: true is set.
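To illustrate how the workload identity flow differs from IMDS, here is a minimal Go sketch of the token exchange, using only the standard library. It is not how databricks-sdk-go works today and is not a proposed patch. It assumes the pod has been mutated by the Azure AD workload identity webhook, which injects the AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_AUTHORITY_HOST and AZURE_FEDERATED_TOKEN_FILE environment variables. The helper names tokenEndpoint and federatedTokenForm are hypothetical, and the scope is an assumption based on the well-known Azure Databricks application ID, which does not appear in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// Assumption: the well-known Azure AD application ID of Azure Databricks,
// with the "/.default" suffix required by the v2.0 token endpoint.
const databricksScope = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"

// tokenEndpoint builds the Azure AD v2.0 token URL for a tenant.
// (Hypothetical helper, named for this sketch only.)
func tokenEndpoint(authorityHost, tenantID string) string {
	return strings.TrimRight(authorityHost, "/") + "/" + tenantID + "/oauth2/v2.0/token"
}

// federatedTokenForm builds a client-credentials request body that presents
// the projected Kubernetes service account token as a client assertion.
// (Hypothetical helper, named for this sketch only.)
func federatedTokenForm(clientID, assertion, scope string) url.Values {
	return url.Values{
		"grant_type":            {"client_credentials"},
		"client_id":             {clientID},
		"client_assertion_type": {"urn:ietf:params:oauth:client-assertion-type:jwt-bearer"},
		"client_assertion":      {strings.TrimSpace(assertion)},
		"scope":                 {scope},
	}
}

func main() {
	// The token file path is injected by the workload identity webhook.
	tokenFile := os.Getenv("AZURE_FEDERATED_TOKEN_FILE")
	if tokenFile == "" {
		fmt.Println("AZURE_FEDERATED_TOKEN_FILE is not set; not running under workload identity")
		return
	}
	assertion, err := os.ReadFile(tokenFile)
	if err != nil {
		fmt.Println("reading federated token:", err)
		return
	}

	endpoint := tokenEndpoint(os.Getenv("AZURE_AUTHORITY_HOST"), os.Getenv("AZURE_TENANT_ID"))
	form := federatedTokenForm(os.Getenv("AZURE_CLIENT_ID"), string(assertion), databricksScope)

	// Exchange the service account token for an Azure AD access token.
	// Note that no request is made to IMDS at any point.
	resp, err := http.PostForm(endpoint, form)
	if err != nil {
		fmt.Println("token request failed:", err)
		return
	}
	defer resp.Body.Close()

	var body struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		fmt.Println("decoding token response:", err)
		return
	}
	fmt.Println("received access token of length", len(body.AccessToken))
}
```

The projected token file is rotated periodically, so a real credential provider would presumably re-read it on every refresh rather than cache its contents.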

nfx commented 1 year ago

Could you link to a description of the protocol? We would rather not depend on external libraries where we can avoid it, simply because of the size of the transitive dependency tree.

agchang commented 1 year ago

https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview#how-it-works

NathanNZ commented 11 months ago

This functionality could help considerably when building secure GitHub pipelines (https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-azure).

While reducing transitive dependencies is a good general practice, I believe the security benefits and velocity improvements of using cloud-vendor identity libraries would be worth considering. From a customer-centric viewpoint, it would allow your customers to adopt security best practices without having to wait months or years for cloud-vendor-specific functionality to be implemented. From the development side, it would drastically reduce the amount of custom identity logic you'd need to maintain yourself and let you deliver features faster when they are released upstream.