treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data
https://docs.lakefs.io
Apache License 2.0

Running lakeFS on AKS without passing credentials #2997

Open nopcoder opened 2 years ago

nopcoder commented 2 years ago

Support a way to deploy lakeFS with access to storage without passing key/secrets to the lakeFS container running inside AKS (Azure's Managed Kubernetes Service).

The current lakeFS implementation does not work inside an AKS cluster when configured with 'msi' (managed service identities).

Based on https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-identity, "Use Pod-managed Identities" will enable that.

After deploying an AKS cluster following https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-identity#use-pod-managed-identities, lakeFS fails to access storage configured with a user-managed identity.

nopcoder commented 2 years ago

Example of the error we get when attempting to put an object to storage configured with a pod-managed identity:

===== INTERNAL ERROR =====
DefaultAzureCredential authentication failed
GET http://169.254.169.254/metadata/identity/oauth2/token
--------------------------------------------------------------------------------
RESPONSE 403 Forbidden
--------------------------------------------------------------------------------
failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}

--------------------------------------------------------------------------------
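To reproduce the failing call outside lakeFS, the token request from the log can be issued directly from inside a pod. Below is a minimal sketch using only the Go standard library. The `api-version` value, the storage resource URI, and the `Metadata: true` header follow the documented Azure instance metadata token request; the helper name `newIMDSTokenRequest` and the `<client-id>` placeholder are illustrative assumptions, not part of lakeFS.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

// imdsTokenEndpoint is the token endpoint that appears in the error log above.
const imdsTokenEndpoint = "http://169.254.169.254/metadata/identity/oauth2/token"

// newIMDSTokenRequest (hypothetical helper) builds a managed identity token
// request for the given resource. A non-empty clientID asks for a specific
// user-assigned identity; an empty one leaves the choice to the endpoint.
func newIMDSTokenRequest(resource, clientID string) (*http.Request, error) {
	q := url.Values{}
	q.Set("api-version", "2018-02-01")
	q.Set("resource", resource)
	if clientID != "" {
		q.Set("client_id", clientID)
	}
	req, err := http.NewRequest(http.MethodGet, imdsTokenEndpoint+"?"+q.Encode(), nil)
	if err != nil {
		return nil, err
	}
	// The metadata service rejects requests that lack this header.
	req.Header.Set("Metadata", "true")
	return req, nil
}

func main() {
	// "<client-id>" is a placeholder: use the client ID of the identity bound to the pod.
	req, err := newIMDSTokenRequest("https://storage.azure.com/", "<client-id>")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
	// On success the body holds an access token, so print it only on failure.
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		fmt.Println(string(body))
	}
}
```

If this returns the same "Identity not found" body when run in the lakeFS pod's namespace, the problem is likely in the identity binding rather than in lakeFS itself.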
nopcoder commented 2 years ago

Using the following packages, I managed to write test code that uses a pod-managed identity to access the storage:

    "github.com/Azure/azure-sdk-for-go/sdk/azidentity"
    "github.com/Azure/azure-sdk-for-go/sdk/storage/azblob"

I followed the "Use managed identities in Azure Kubernetes Service" instructions and used the Azure CLI to apply the pod-level identity:

az aks pod-identity add -g <group> --cluster-name <cluster> --identity-resource-id <resource-id> --namespace <namespace> -n <name>

lakeFS currently does not use the above libraries and fails to start when configured to use 'msi' (managed service identities). There is no error in the process log; the pod goes into a crash loop.

guy-har commented 1 year ago

@nopcoder was this fixed as part of upgrading the Azure SDK?

nopcoder commented 1 year ago

> @nopcoder was this fixed as part of upgrading the Azure SDK?

The SDK was updated, but the above has not been tested using our Helm chart. It was tested locally, not as part of AKS.

github-actions[bot] commented 1 year ago

This issue is now marked as stale after 90 days of inactivity, and will be closed soon. To keep it, mark it with the "no stale" label.