Open leinad87 opened 2 months ago
Unless I am misinterpreting this PR, I think this is a documentation issue, not a feature support issue. It looks like Workload Identity is already implemented. I will be trying to get it working myself in the coming days. https://github.com/grafana/loki/pull/11802
Edit: Nevermind, I see that PR was specificly for the Loki Operator, not Loki itself. I'm not even sure where the relevant code is for this. I hope it's not this file, because that is using an old azure go library that has been effectively deprecated for more than a year now, so updating it will require more effort.
Good news!
Workload identity is actually supported already as of PR #7540 that I somehow missed before, it's just a poorly named option. The azure storage config value that you want is called useFederatedToken
in the loki
helm chart (use_federated_token
if setting in loki's config.yaml)
I have tested today and loki is able to successfully authenticate.
I had to add this to the helm values to get it to work:
loki:
podLabels:
azure.workload.identity/use: 'true'
yes, I assumed that was a given when using workload identity. There are a number of other things that need to be setup, like a "federated credential" azure resource, and enabling some settings on your azure kubernetes cluster to allow it to issue OIDC tokens. You'll also need a label and annotation on the service account:
serviceAccount:
annotations:
azure.workload.identity/client-id: 'your-client-id-here'
labels:
azure.workload.identity/use: 'true'
To be fair, not everything requires the podLabels
. external-dns
does, but external-secrets-operator
works with only the serviceAccount
annotation as an example.
Tried this tonight, but getting a panic from ADAL(!?)
The workload identity token is added to the pod with the usual AZURE_ env variables...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1bb19b9]
goroutine 1 [running]:
github.com/Azure/go-autorest/autorest/adal.(*ServicePrincipalToken).SetCustomRefreshFunc(...)
/src/loki/vendor/github.com/Azure/go-autorest/autorest/adal/token.go:418
github.com/grafana/loki/v3/pkg/storage/chunk/client/azure.(*BlobStorage).getServicePrincipalToken(0xc001706e00, {0x2b87c90?, 0x2b87c98?})
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:441 +0x279
github.com/grafana/loki/v3/pkg/storage/chunk/client/azure.(*BlobStorage).getOAuthToken(0xc001706e00)
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:393 +0xe6
github.com/grafana/loki/v3/pkg/storage/chunk/client/azure.(*BlobStorage).newPipeline(0xc001706e00, {0xee6b280, 0x3, 0x14}, 0x0)
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:377 +0x265
github.com/grafana/loki/v3/pkg/storage/chunk/client/azure.NewBlobStorage(0xc000676a10, {0xc000814040?, {0x325bfe8?, 0xc000bd6420?}}, {0xc000c75080?, 0xc001798698?, 0x4105e5?})
/src/loki/pkg/storage/chunk/client/azure/blob_storage_client.go:201 +0x111
github.com/grafana/loki/v3/pkg/ruler/base.NewLegacyRuleStore({{0xc00082c560, 0x5}, {{0x2a7e063, 0xb}, {0xc000c99c50, 0x11}, {{0x0, 0x0}}, {0x0, 0x0}, ...}, ...}, ...)
/src/loki/pkg/ruler/base/storage.go:99 +0x325
github.com/grafana/loki/v3/pkg/loki.(*Loki).initRulerStorage(0xc001708000)
/src/loki/pkg/loki/modules.go:1153 +0x2fb
github.com/grafana/dskit/modules.(*Manager).initModule(0xc000aa43a8, {0x7ffd9c975324, 0x3}, 0x1?, 0xc0017000c0?)
/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:136 +0x1f7
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x0?, {0xc00066e0c0, 0x1, 0xc000cc8180?})
/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:108 +0xd8
github.com/grafana/loki/v3/pkg/loki.(*Loki).Run(0xc001708000, {0x0?, {0x4?, 0x3?, 0x4912940?}})
/src/loki/pkg/loki/loki.go:453 +0x9d
main.main()
/src/loki/cmd/loki/main.go:122 +0x113b
```yaml monitoring: serviceMonitor: enable: true loki: podLabels: azure.workload.identity/use: 'true' commonConfig: path_prefix: /var/loki replication_factor: 3 compactor_address: '{{ include "loki.compactorAddress" . }}' schemaConfig: configs: - from: "2024-04-01" store: tsdb index: prefix: loki_index_ period: 24h object_store: azure schema: v13 storage: type: azure bucketNames: chunks: loki-chunks # Needs to pre-exist ruler: loki-rulers # Needs to pre-exist admin: loki-admin # Needs to pre-exist azure: accountName: redacted userAssignedId: "redacted" endpointSuffix: blob.core.windows.net useFederatedToken: true singleBinary: replicas: 0 extraArgs: - "-config.expand-env=true" deploymentMode: SingleBinary ```
@Richard87 Indeed, I came to this issue board to raise an issue on this when I saw this thread already open.
When using the useManagedIdentity: true
flag an error appears from adal
:
adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=<redacted-clientId>&resource=https%3A%2F%2F<redacted-storage-account-name>.blob.core.windows.net
This is quite concerning as the adal
library has been deprecated for over a year now.
This issue is about workload identity, not managed identity. While workload identity makes use of managed identities under the hood, it is a different authentication method.
I'm not a Loki maintainer, but I suspect it would be better to create a new issue either about managed identity authentication, or more generally about the deprecated ADAL library, which might get more attention. When Loki is updated to the Azure.Identity
sub-package of the Azure SDK for Go it can make use of the DefaultAzureCredential
which will auto-detect which authentication method you are using from a number of different options, including both Managed Identity and Workload Identity.
These are the relevant helm values I am using successfully for Workload Identity on Loki Helm Chart 6.5.2
loki:
podLabels:
"azure.workload.identity/use": "true"
storage:
type: "azure"
azure:
accountName: "redacted"
useFederatedToken: true
bucketNames:
chunks: "loki-chunk-store"
ruler: "loki-ruler-store"
serviceAccount:
annotations:
"azure.workload.identity/client-id": "redacted"
labels:
"azure.workload.identity/use": "true"
Thanks, I added azure.workload.identity/client-id
annotation manually to the service account, and it stopped panicking!
(And when creating the bucket names in the correct storage account, everything worked great! 🙏 )
Is your feature request related to a problem? Please describe. To configure Azure storage, there are some options like managed identities or service principal. I don't like managed identities becase they can be used by any pod in the cluster. If you think about service principal, SAS or blob storage token, they have sensitive data that should be managed carefully and secrets are not easily configured: https://github.com/grafana/loki/issues/9143
Describe the solution you'd like Azure has introduce Federated Identity Credentials. This will allow to sync a Service Account with a Azure application with required permisions without passwords.
Could be possible to implement Azure Federated Workload Identity to authenticate Loki against Azure Storage Account?