Open daleksandrowiczgd opened 10 months ago
@daleksandrowiczgd can you share what your kubeconfig looks like after kubelogin convert-kubeconfig -l spn? you only need to capture the exec plugin part:
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  command: kubelogin
  args:
  - get-token
  - --environment
  - AzurePublicCloud
  - --server-id
  - <AAD server app ID>
  - --client-id
  - <AAD client app ID>
  - --tenant-id
  - <AAD tenant ID>
besides, in your repro step you seem to miss -l spn? not sure if this is intentional or not?
Convert kubeconfig with kubelogin convert-kubeconfig command and Service Principal parameters specified as arguments --client-id <AZURE_CLIENT_ID> --client-secret <AZURE_CLIENT_SECRET> --tenant-id <AZURE_TENANT_ID>
can you also share the actual relevant environment variables in your runner environment? There is no reference to AZURE_FEDERATED_CREDENTIALS in any repo I can find.
besides, in your repro step you seem to miss -l spn? not sure if this is intentional or not?
Not intentional; I missed it in my message, so please assume it should be there.
can you also share the actual relevant environment variables in your runner environment? There is no reference to AZURE_FEDERATED_CREDENTIALS in any repo I can find.
Actually, I just listed all the environment variables set by Workload Identity, but it seems that only the AZURE_CLIENT_ID variable conflicts with the kubelogin tool.
can you share what your kubeconfig looks like after kubelogin convert-kubeconfig -l spn? you only need to capture the exec plugin part:
Yes, let me share a few cases with the exact commands I ran and the resulting kubeconfig file after each of them.
In all cases the following env variables are set (I will use the same names to identify the same values in the kubeconfig):
export SP_CLIENT_ID="<SP_CLIENT_ID>" # Client (Application) ID of the Service Principal
export SP_CLIENT_SECRET="<SP_CLIENT_SECRET>" # Client secret of the Service Principal
export SP_TENANT_ID="<SP_TENANT_ID>" # Tenant ID of the Service Principal
export CLUSTER_RG="<CLUSTER_RG>" # AKS cluster resource group
export CLUSTER_NAME="<CLUSTER_NAME>" # AKS cluster name
export CUSTOM_KUBECONFIG_PATH="${HOME}/.kube/config" # Path to the kubeconfig
Additionally, to make the output easier to follow, I will write WI_CLIENT_ID (Workload Identity client ID) instead of the real value.
az login --service-principal -u "${SP_CLIENT_ID}" -p "${SP_CLIENT_SECRET}" -t "${SP_TENANT_ID}"
az aks get-credentials --resource-group "${CLUSTER_RG}" --name "${CLUSTER_NAME}" --file "${CUSTOM_KUBECONFIG_PATH}"
kubelogin convert-kubeconfig --login "spn" --kubeconfig "${CUSTOM_KUBECONFIG_PATH}" --client-id "${SP_CLIENT_ID}" --client-secret "${SP_CLIENT_SECRET}" --tenant-id "${SP_TENANT_ID}"
Kubeconfig:
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  args:
  - get-token
  - --login
  - spn
  - --server-id
  - 6dae42f8-4368-4678-94ff-3960e28e3630
  - --client-id
  - ${WI_CLIENT_ID}
  - --tenant-id
  - ${SP_TENANT_ID}
  - --environment
  - AzurePublicCloud
  - --client-secret
  - ${SP_CLIENT_SECRET}
  command: kubelogin
  env: null
  installHint: |2
    kubelogin is not installed which is required to connect to AAD enabled cluster.
    To learn more, please go to https://aka.ms/aks/kubelogin
  provideClusterInfo: false
$ kubectl get pod --kubeconfig ${CUSTOM_KUBECONFIG_PATH}
# RESPONSE 401 Unauthorized
# --------------------------------------------------------------------------------
# {
# "error": "invalid_client",
# "error_description": "AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to app '${WI_CLIENT_ID}'. Trace ID: 1d7b8101-9172-4b1e-8ac6-ac840f666301 Correlation ID: 9d204593-95f3-460e-81e6-fb6a94e4d959 Timestamp: 2024-01-03 10:45:10Z",
# "error_codes": [
# 7000215
# ],
# "timestamp": "2024-01-03 10:45:10Z",
# "trace_id": "1d7b8101-9172-4b1e-8ac6-ac840f666301",
# "correlation_id": "9d204593-95f3-460e-81e6-fb6a94e4d959",
# "error_uri": "https://login.microsoftonline.com/error?code=7000215"
# }
# --------------------------------------------------------------------------------
# To troubleshoot, visit https://aka.ms/azsdk/go/identity/troubleshoot#client-secret
# Unable to connect to the server: getting credentials: exec: executable kubelogin failed with exit code 1
Conclusion: the --client-id argument in the kubeconfig was overridden by Workload Identity. The kubectl command fails with a 401 error.
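To check which client ID ended up in a converted kubeconfig without reading the whole file, something like the following could be used. This is only a sketch: the function name kubeconfig_client_id is made up for illustration, and it assumes the exec args are written one per line, as in the kubeconfig shown above.

```shell
# Hypothetical helper: print the value that follows "- --client-id"
# in the exec args of a kubeconfig file.
# Assumes one argument per line, as kubelogin convert-kubeconfig wrote it above.
kubeconfig_client_id() {
  awk '
    # The line after "--client-id" holds the value: strip the "- " prefix.
    found { sub(/^[[:space:]]*-[[:space:]]*/, ""); print; exit }
    /^[[:space:]]*-[[:space:]]*--client-id[[:space:]]*$/ { found = 1 }
  ' "$1"
}
```

Running kubeconfig_client_id "${CUSTOM_KUBECONFIG_PATH}" right after the convert step would show whether the Service Principal or the Workload Identity client ID was written.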
az login --service-principal -u "${SP_CLIENT_ID}" -p "${SP_CLIENT_SECRET}" -t "${SP_TENANT_ID}"
az aks get-credentials --resource-group "${CLUSTER_RG}" --name "${CLUSTER_NAME}" --file "${CUSTOM_KUBECONFIG_PATH}"
AZURE_CLIENT_ID="${SP_CLIENT_ID}" kubelogin convert-kubeconfig --login "spn" --kubeconfig "${CUSTOM_KUBECONFIG_PATH}" --client-id "${SP_CLIENT_ID}" --client-secret "${SP_CLIENT_SECRET}" --tenant-id "${SP_TENANT_ID}"
Kubeconfig:
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  args:
  - get-token
  - --login
  - spn
  - --server-id
  - 6dae42f8-4368-4678-94ff-3960e28e3630
  - --client-id
  - ${SP_CLIENT_ID}
  - --tenant-id
  - ${SP_TENANT_ID}
  - --environment
  - AzurePublicCloud
  - --client-secret
  - ${SP_CLIENT_SECRET}
  command: kubelogin
  env: null
  installHint: |2
    kubelogin is not installed which is required to connect to AAD enabled cluster.
    To learn more, please go to https://aka.ms/aks/kubelogin
  provideClusterInfo: false
$ kubectl get pod --kubeconfig ${CUSTOM_KUBECONFIG_PATH}
RESPONSE 401 Unauthorized
--------------------------------------------------------------------------------
{
"error": "invalid_client",
"error_description": "AADSTS7000215: Invalid client secret provided. Ensure the secret being sent in the request is the client secret value, not the client secret ID, for a secret added to app '${WI_CLIENT_ID}'. Trace ID: 854ccce1-bdf3-4122-8453-8e3ba73f9700 Correlation ID: 1371d7e8-053f-4f32-8f7b-9859bc27f635 Timestamp: 2024-01-03 11:00:37Z",
"error_codes": [
7000215
],
"timestamp": "2024-01-03 11:00:37Z",
"trace_id": "854ccce1-bdf3-4122-8453-8e3ba73f9700",
"correlation_id": "1371d7e8-053f-4f32-8f7b-9859bc27f635",
"error_uri": "https://login.microsoftonline.com/error?code=7000215"
}
--------------------------------------------------------------------------------
To troubleshoot, visit https://aka.ms/azsdk/go/identity/troubleshoot#client-secret
Unable to connect to the server: getting credentials: exec: executable kubelogin failed with exit code 1
Conclusion: the --client-id argument in the kubeconfig got the correct client ID of the Service Principal. The kubectl command still fails with a 401 error, again with the Workload Identity client ID (WI_CLIENT_ID) in the error.
az login --service-principal -u "${SP_CLIENT_ID}" -p "${SP_CLIENT_SECRET}" -t "${SP_TENANT_ID}"
az aks get-credentials --resource-group "${CLUSTER_RG}" --name "${CLUSTER_NAME}" --file "${CUSTOM_KUBECONFIG_PATH}"
AZURE_CLIENT_ID="${SP_CLIENT_ID}" kubelogin convert-kubeconfig --login "spn" --kubeconfig "${CUSTOM_KUBECONFIG_PATH}" --client-id "${SP_CLIENT_ID}" --client-secret "${SP_CLIENT_SECRET}" --tenant-id "${SP_TENANT_ID}"
Kubeconfig:
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  args:
  - get-token
  - --login
  - spn
  - --server-id
  - 6dae42f8-4368-4678-94ff-3960e28e3630
  - --client-id
  - ${SP_CLIENT_ID}
  - --tenant-id
  - ${SP_TENANT_ID}
  - --environment
  - AzurePublicCloud
  - --client-secret
  - ${SP_CLIENT_SECRET}
  command: kubelogin
  env: null
  installHint: |2
    kubelogin is not installed which is required to connect to AAD enabled cluster.
    To learn more, please go to https://aka.ms/aks/kubelogin
  provideClusterInfo: false
$ AZURE_CLIENT_ID="${SP_CLIENT_ID}" kubectl get pod --kubeconfig ${CUSTOM_KUBECONFIG_PATH}
No resources found in default namespace.
Conclusion: the --client-id argument in the kubeconfig got the correct client ID of the Service Principal. kubectl works correctly because AZURE_CLIENT_ID was overridden only while running this command.
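The per-command override used in this case can be wrapped in a small helper so that it does not have to be repeated in every script. This is a minimal sketch, not part of kubelogin: the name with_sp_client_id is hypothetical, and it relies on SP_CLIENT_ID being set as in the list of variables above.

```shell
# Hypothetical helper: run any command with AZURE_CLIENT_ID replaced by the
# Service Principal client ID, for that one child process only.
# The calling shell keeps the value injected by Workload Identity.
with_sp_client_id() {
  env AZURE_CLIENT_ID="${SP_CLIENT_ID:?SP_CLIENT_ID must be set}" "$@"
}

# Usage (not executed here):
#   with_sp_client_id kubectl get pod --kubeconfig "${CUSTOM_KUBECONFIG_PATH}"
```

Using env rather than a plain export keeps the change scoped to the single kubectl or helm invocation, which matches what the command above does by hand.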
az login --service-principal -u "${SP_CLIENT_ID}" -p "${SP_CLIENT_SECRET}" -t "${SP_TENANT_ID}"
az aks get-credentials --resource-group "${CLUSTER_RG}" --name "${CLUSTER_NAME}" --file "${CUSTOM_KUBECONFIG_PATH}"
kubelogin convert-kubeconfig --login "spn" --kubeconfig "${CUSTOM_KUBECONFIG_PATH}" --client-id "${SP_CLIENT_ID}" --client-secret "${SP_CLIENT_SECRET}" --tenant-id "${SP_TENANT_ID}"
Kubeconfig:
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  args:
  - get-token
  - --login
  - spn
  - --server-id
  - 6dae42f8-4368-4678-94ff-3960e28e3630
  - --client-id
  - ${WI_CLIENT_ID}
  - --tenant-id
  - ${SP_TENANT_ID}
  - --environment
  - AzurePublicCloud
  - --client-secret
  - ${SP_CLIENT_SECRET}
  command: kubelogin
  env: null
  installHint: |2
    kubelogin is not installed which is required to connect to AAD enabled cluster.
    To learn more, please go to https://aka.ms/aks/kubelogin
  provideClusterInfo: false
$ AZURE_CLIENT_ID="${SP_CLIENT_ID}" kubectl get pod --kubeconfig ${CUSTOM_KUBECONFIG_PATH}
No resources found in default namespace.
Conclusion: the --client-id argument in the kubeconfig was overridden by Workload Identity. kubectl works correctly because AZURE_CLIENT_ID was overridden only while running this command.
So as you can see, the AZURE_CLIENT_ID variable always takes precedence over the CLI arguments defined in the kubeconfig:
- If the Workload Identity --client-id was put inside the kubeconfig, you can override the AZURE_CLIENT_ID variable only while running the kubectl/helm command, but this is not a convenient solution.
- If the Service Principal --client-id was put inside the kubeconfig, the kubectl/helm command fails with a 401 error, which makes it impossible to use a Service Principal when Workload Identity is already injected into the Pod.
Hope everything is understandable.
@daleksandrowiczgd does directly overriding the env field help as a workaround?
exec:
  apiVersion: client.authentication.k8s.io/v1beta1
  args:
  - get-token
  - ...
  command: kubelogin
  env:
  - name: "AZURE_CLIENT_ID"
    value: "value for client ID"
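Instead of editing the kubeconfig by hand, the env entry could probably be added with kubectl itself. The sketch below assumes that kubectl config set-credentials with --exec-env merges the variable into the existing exec entry of that user; please verify this against your kubectl version before relying on it. The function name pin_exec_client_id is hypothetical, and the user entry name has to be looked up in your own kubeconfig.

```shell
# Hypothetical helper: add AZURE_CLIENT_ID to the exec plugin env of one
# kubeconfig user entry.
# Arguments: <kubeconfig path> <user entry name> <client ID>
# Assumption: --exec-env is merged into the existing exec config of that user.
pin_exec_client_id() {
  kubectl config set-credentials "$2" \
    --kubeconfig "$1" \
    --exec-env="AZURE_CLIENT_ID=$3"
}

# Usage (not executed here; <USER_ENTRY> is whatever name your kubeconfig uses):
#   pin_exec_client_id "${CUSTOM_KUBECONFIG_PATH}" "<USER_ENTRY>" "${SP_CLIENT_ID}"
```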
what is your environment? do you know how AZURE_CLIENT_ID is set?
@weinong, our environment:
We migrated from the deprecated AAD Pod Identity to Workload Identity. In our Gitlab CI jobs we want to be able to switch between Service Principal and Managed Identity credentials if needed (e.g. we sometimes hit Azure API throttling, so we need the option to use an SP in such cases). After migrating to Workload Identity, we encountered the problem that I described in the issue description.
In general, this is how the runner config related to Workload Identity looks and how those variables are injected:
- ServiceAccount with the annotation azure.workload.identity/client-id: <SP_CLIENT_ID> (<SP_CLIENT_ID> - client ID of the Service Principal)
- serviceAccount: <SA_NAME> (<SA_NAME> - the name of the above ServiceAccount)
- azure.workload.identity/use: "true"
- Injected environment variables (AZURE_CLIENT_ID interferes with kubelogin):
  AZURE_CLIENT_ID
  AZURE_TENANT_ID
  AZURE_FEDERATED_TOKEN_FILE
  AZURE_AUTHORITY_HOST
Right now, after the migration to Workload Identity, it is impossible to use either a Service Principal or a Managed Identity with a different client ID, because the AZURE_CLIENT_ID environment variable injected by Workload Identity always takes precedence in kubelogin commands.
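To see which of these variables are actually present in a runner Pod, a quick check along these lines may help. It is a sketch only: the function name print_wi_env is made up, and it reports just set or unset so that no values end up in job logs.

```shell
# Hypothetical helper: report which of the Workload Identity variables listed
# above are set in the current environment, without printing their values.
print_wi_env() {
  for name in AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_FEDERATED_TOKEN_FILE AZURE_AUTHORITY_HOST; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      printf '%s=set\n' "$name"
    else
      printf '%s=unset\n' "$name"
    fi
  done
}
```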
@enj yes, this workaround works, thanks for the idea. But we would still need to modify every pipeline and script to manually update the kubeconfig file everywhere we want to use the Service Principal credentials.
Maybe it would be good to add the AZURE_CLIENT_ID env variable to this exec part of the kubeconfig file when Service Principal is chosen in the kubelogin convert-kubeconfig command (--login "spn"), wdyt?
Or, even better, kubelogin could use the --client-id CLI argument that is passed to the get-token command inside the kubeconfig.
We just hit this issue in Azure DevOps too, while trying to migrate our self-hosted runners to use workload identity instead of aadpodidentity. On a quick skim of the docs I don't see any way to manipulate the environment for these commands to implement a similar workaround.
In general, the idea that environment variables override provided CLI arguments is quite surprising; most tooling I'm used to interacting with uses environment variables as a fallback if CLI arguments aren't provided. If the kubelogin tool used the environment as a fallback for missing CLI arguments, there would be no problem here at all.
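For illustration, the fallback behaviour described above could look roughly like this. It is a shell sketch of the precedence rule only, not kubelogin's actual code (kubelogin is not written in shell), and the function name resolve_client_id is hypothetical.

```shell
# Hypothetical illustration: take --client-id from the arguments when present,
# and fall back to AZURE_CLIENT_ID only when the flag is missing.
resolve_client_id() {
  rcid_cli_value=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --client-id)
        rcid_cli_value="${2:-}"
        # Skip the value too, but only if there is one left to skip.
        if [ "$#" -gt 1 ]; then shift; fi
        ;;
      --client-id=*)
        rcid_cli_value="${1#--client-id=}"
        ;;
    esac
    shift
  done
  # The CLI argument wins; the environment variable is only a fallback.
  printf '%s\n' "${rcid_cli_value:-${AZURE_CLIENT_ID:-}}"
}
```

With this rule, the get-token arguments stored in the kubeconfig would decide the identity, and the variable injected by Workload Identity would only be used when no --client-id is given.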
Problem
We have been encountering an issue in our pipelines when we try to run kubectl commands in self-managed runners.
When we have Workload Identity variables in our Pod (AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_FEDERATED_CREDENTIALS), we have no way to use another type of Azure AD object such as a Service Principal, even if we have converted the kubeconfig by using kubelogin convert-kubeconfig with Service Principal credentials.
When we migrated from the deprecated AAD Pod Identity, we had to add many workarounds so as not to break any of our pipelines/scripts. For example, we had to override the AZURE_CLIENT_ID variable only for the duration of the kubectl command, which is redundant, time-consuming and inconvenient.
We noticed that even if you are logged in via az login with SP credentials, after running az aks get-credentials and then kubelogin convert-kubeconfig -l spn with the Service Principal arguments --client-id <AZURE_CLIENT_ID> --client-secret <AZURE_CLIENT_SECRET> --tenant-id <AZURE_TENANT_ID>, the kubeconfig file still gets the Workload Identity client ID as the CLI argument of the kubelogin get-token command.
Moreover, even if we override AZURE_CLIENT_ID with the SP one only while running the kubelogin convert-kubeconfig -l spn command, the next kubectl command still uses the Workload Identity.
The AZURE_CLIENT_ID environment variable always takes precedence over CLI arguments. I don't know if this is intended, but it is certainly not obvious.
There should be an option to authenticate to the AKS cluster by using a Service Principal, even if Workload Identity's variables are injected into the Pod. The fact that the command line arguments in the kubeconfig are always overridden by the AZURE_CLIENT_ID variable doesn't give us any flexibility.
How to reproduce
Prerequisites
Steps
Run the Gitlab job which will do the following:
- az login as the Service Principal
- az aks get-credentials command
- Convert kubeconfig with kubelogin convert-kubeconfig command and Service Principal parameters specified as arguments --client-id <AZURE_CLIENT_ID> --client-secret <AZURE_CLIENT_SECRET> --tenant-id <AZURE_TENANT_ID>
- kubectl command (best to run against some Kubernetes resource the SP has and WI doesn't have access to, so you will see the object id of the logged in AAD identity)
Expected output
kubectl command runs with Service Principal credentials
Real output
kubectl command runs with Workload Identity credentials