kubernetes-sigs / cluster-api-provider-azure

Cluster API implementation for Microsoft Azure
https://capz.sigs.k8s.io/
Apache License 2.0

could not get instance metadata on Windows node #2132

Closed andyzhangx closed 1 week ago

andyzhangx commented 2 years ago

/kind bug

What steps did you take and what happened:

I set up a CAPZ cluster with a Windows Server 2019 Datacenter node and installed the CSI driver on the Windows node; the CSI driver could not get instance metadata on the Windows node:

# kubectl get no -o wide
NAME                              STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION     CONTAINER-RUNTIME
capz-d0un-gqnn8                   Ready    <none>                 23m   v1.22.1   10.1.0.6      <none>        Windows Server 2019 Datacenter   10.0.17763.2237    containerd://1.6.0-beta.0
I0228 11:31:21.570720    3008 utils.go:77] GRPC call: /csi.v1.Node/NodeGetInfo
I0228 11:31:21.570720    3008 utils.go:78] GRPC request: {}
W0228 11:31:42.573798    3008 nodeserver.go:337] get zone(capz-jn2u-8j6rb) failed with: Get "http://169.254.169.254/metadata/instance?api-version=2019-03-11&format=json": dial tcp 169.254.169.254:80: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

detailed logs from CI: es-sigs_azuredisk-csi-driver/1054/pull-kubernetes-e2e-capz-azure-disk-windows/1498155530961555456/build-log.txt
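
For context, the failing call is a plain HTTP GET against the Azure IMDS endpoint. Below is a minimal probe sketch (not the driver's actual code) that can be run on a node or inside a pod to check IMDS reachability; the endpoint URL and the required `Metadata: true` header are standard Azure IMDS usage, and the short timeout mirrors the connectex timeout in the log above.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// IMDS is a link-local, non-routable endpoint; a short timeout keeps the
	// probe from hanging when the address is unreachable from a pod.
	client := &http.Client{Timeout: 5 * time.Second}

	req, err := http.NewRequest("GET",
		"http://169.254.169.254/metadata/instance?api-version=2021-10-01&format=json", nil)
	if err != nil {
		panic(err)
	}
	// Azure IMDS rejects requests that do not carry this header.
	req.Header.Set("Metadata", "true")

	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("IMDS unreachable:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```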

andyzhangx commented 2 years ago

cc @CecileRobertMichon @marosset this is the reason why the CSI driver does not work on Windows nodes now.

marosset commented 2 years ago

cc @jsturtevant @devigned We had some discussions about blocking instance metadata endpoints for Windows nodes but I can't find the issues/PRs at the moment (maybe they are in the image builder repo)?

marosset commented 2 years ago

Found the PR to block this - https://github.com/kubernetes-sigs/image-builder/pull/694

jsturtevant commented 2 years ago

We block containers' access to the wireserver here: https://github.com/kubernetes-sigs/image-builder/pull/719 due to a CVE

marosset commented 2 years ago

From reading through the comments it sounds like we want to run some of the containers as ContainerAdministrator so they can get wire server access.

marosset commented 2 years ago

> From reading through the comments it sounds like we want to run some of the containers as ContainerAdministrator so they can get wire server access.

Oops, looks like we went with option 2 which blocks access to wireserver for ContainerAdministrator users.

Update: we went with option two, adding a group. This blocks access to the wireserver for ContainerAdministrator and allows adding permissions for other users/apps.

I'm not sure how to give containers access to the wireserver without allowing access to all containers running as ContainerAdministrator.

CecileRobertMichon commented 2 years ago

for reference: https://msrc.microsoft.com/update-guide/vulnerability/CVE-2021-27075

marosset commented 2 years ago

I spoke with @jsturtevant and I think the right course of action here is to run the csi-driver containers as HostProcess containers. We can run HostProcess containers as system accounts on the node, which can be part of the security group that has wireserver access, and this would not require any updates to the csi-driver binaries/logic.
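
For reference, a rough sketch of what the hostProcess approach could look like in a pod spec, expressed with the Go API types (the container name and image are placeholders, not the real driver manifest): the key pieces are host networking plus the Windows `hostProcess` security options with a node-level account such as `NT AUTHORITY\SYSTEM`.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	hostProcess := true
	runAs := "NT AUTHORITY\\SYSTEM" // a node account that can be granted wireserver/IMDS access

	spec := corev1.PodSpec{
		// HostProcess pods must also run on the host network.
		HostNetwork: true,
		SecurityContext: &corev1.PodSecurityContext{
			WindowsOptions: &corev1.WindowsSecurityContextOptions{
				HostProcess:   &hostProcess,
				RunAsUserName: &runAs,
			},
		},
		Containers: []corev1.Container{{
			Name:  "csi-node-win",                       // placeholder name
			Image: "example.registry/azuredisk-csi:win", // placeholder image
		}},
		NodeSelector: map[string]string{"kubernetes.io/os": "windows"},
	}

	out, _ := yaml.Marshal(spec)
	fmt.Println(string(out)) // print the equivalent YAML for inspection
}
```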

I am curious how this works on Linux. Is wireserver access blocked for all containers on Linux? I noticed in the deployment files that the containers use host networking. Does that allow access to the wireserver?

CecileRobertMichon commented 2 years ago

Ref: https://github.com/kubernetes-sigs/image-builder/pull/690/files#diff-adaa5bbfc20ce5e21aed6ea1e95dfca1d060eb4cf11622555d5242007ec02798R33

All traffic on port 80 is blocked. hostNetwork-enabled containers are still able to reach the wireserver.

Why does the CSI driver need access to the wireserver? To clarify, the fix we implemented does not block IMDS (169.254.169.254). Rather, we block the wireserver endpoint (168.63.129.16).

cc @weinong
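
To make the distinction concrete, here is a small probe of both endpoints (illustrative only; the wireserver path shown is the version-listing URL used by the Azure guest agent, and neither path is necessarily what the CSI driver calls) that can show which of the two is actually blocked from a given container:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// probe reports whether an endpoint answers at all; it does not interpret the body.
func probe(name, url string, headers map[string]string) {
	client := &http.Client{Timeout: 5 * time.Second}
	req, _ := http.NewRequest("GET", url, nil)
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	resp, err := client.Do(req)
	if err != nil {
		fmt.Printf("%s: unreachable (%v)\n", name, err)
		return
	}
	resp.Body.Close()
	fmt.Printf("%s: reachable (%s)\n", name, resp.Status)
}

func main() {
	// IMDS: link-local metadata service, requires the Metadata header.
	probe("IMDS (169.254.169.254)",
		"http://169.254.169.254/metadata/instance?api-version=2021-10-01&format=json",
		map[string]string{"Metadata": "true"})

	// Wireserver: the endpoint the image-builder firewall rule blocks for containers.
	probe("wireserver (168.63.129.16)", "http://168.63.129.16/?comp=versions", nil)
}
```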

marosset commented 2 years ago

Oops, sorry, I mixed up IMDS and wireserver. I'm not sure why IMDS access is blocked here.

marosset commented 2 years ago

@jsturtevant do we need to manually create a route for IMDS endpoints for calico? I see https://github.com/kubernetes-sigs/sig-windows-tools/blob/42d4411003b94e086356f891b278d452fc8f50e8/hostprocess/flannel/flanneld/start.ps1#L28-L31 for flannel (running with host-process containers) but not for calico.

andyzhangx commented 2 years ago

> Ref: https://github.com/kubernetes-sigs/image-builder/pull/690/files#diff-adaa5bbfc20ce5e21aed6ea1e95dfca1d060eb4cf11622555d5242007ec02798R33
>
> All traffic on port 80 is blocked. hostNetwork-enabled containers are still able to reach the wireserver.
>
> Why does the CSI driver need access to the wireserver? To clarify, the fix we implemented does not block IMDS (169.254.169.254). Rather, we block the wireserver endpoint (168.63.129.16).
>
> cc @weinong

@CecileRobertMichon only the Azure Disk CSI driver needs IMDS support, since it needs to get zone and VM size info.

marosset commented 2 years ago

I confirmed that containers in aks-engine clusters have access to IMDS. I also confirmed that containers in CAPZ clusters (running both as ContainerUser and ContainerAdministrator) do not. I'll try and figure out why.

HostProcess containers and Windows nodes in CAPZ clusters DO have access to IMDS, so this appears to be a CNI/calico configuration issue.

marosset commented 2 years ago

/assign

andyzhangx commented 2 years ago

And when I start a driver pod, it cannot access the api-server using kubeconfig on the Windows node; the error looks like the following:

2022-03-08T04:29:27.4405461Z stderr F I0308 04:29:27.439569    3996 azure.go:71] reading cloud config from secret kube-system/azure-cloud-provider
2022-03-08T04:29:27.4430533Z stderr F I0308 04:29:27.443053    3996 round_trippers.go:553] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/secrets/azure-cloud-provider  in 1 milliseconds
2022-03-08T04:29:27.4435421Z stderr F W0308 04:29:27.443542    3996 azure.go:78] InitializeCloudFromSecret: failed to get cloud config from secret kube-system/azure-cloud-provider: failed to get secret kube-system/azure-cloud-provider: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/secrets/azure-cloud-provider": dial tcp 10.96.0.1:443: connectex: A socket operation was attempted to an unreachable network.

So is it related?

marosset commented 2 years ago

> And when I start a driver pod, it cannot access the api-server using kubeconfig on the Windows node; the error looks like the following:
>
> 2022-03-08T04:29:27.4405461Z stderr F I0308 04:29:27.439569    3996 azure.go:71] reading cloud config from secret kube-system/azure-cloud-provider
> 2022-03-08T04:29:27.4430533Z stderr F I0308 04:29:27.443053    3996 round_trippers.go:553] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/secrets/azure-cloud-provider  in 1 milliseconds
> 2022-03-08T04:29:27.4435421Z stderr F W0308 04:29:27.443542    3996 azure.go:78] InitializeCloudFromSecret: failed to get cloud config from secret kube-system/azure-cloud-provider: failed to get secret kube-system/azure-cloud-provider: Get "https://10.96.0.1:443/api/v1/namespaces/kube-system/secrets/azure-cloud-provider": dial tcp 10.96.0.1:443: connectex: A socket operation was attempted to an unreachable network.
>
> So is it related?

I suspect this may be a different issue. Are you seeing this on Windows Server 2019 or Windows Server 2022 nodes, and which CNI/configuration are you using?

Windows nodes in CAPZ configured with calico with overlay networking (the default in CAPZ) cannot access the IMDS. I tested this with both Windows Server 2019 and Windows Server 2022. I suspect this is a limitation of overlay networking on Windows in general.

Running containers as host-process containers means the containers are on the host network which can access the IMDS endpoints.

I do want to understand why we can't access IMDS endpoints from containers with overlay networking and have asked @daschott to help investigate.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

jackfrancis commented 2 years ago

@marosset @daschott should we keep this issue open?

marosset commented 2 years ago

> @marosset @daschott should we keep this issue open?

We worked around the issue by running the CSI drivers in hostProcess containers which can access metadata. I think we should still try to understand why Windows containers in aks-engine were able to access instance metadata with overlay networking and if there is a way around that.

marosset commented 2 years ago

/remove-lifecycle rotten

daschott commented 2 years ago

I wonder if this has to do with the requirement that only Node IPs are allowed to access the IMDS. https://docs.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service?tabs=windows#known-issues-and-faq

@marosset can you try to add destination-based OutboundNAT? In Azure CNI you can add this as follows:

    {
        "Name": "EndpointPolicy",
        "Value": {
            "Type": "LoopbackDSR",
            "IPAddress": "169.254.169.254"
        }
    }

In the HNS endpoint you should see the following policy added, which you should be able to add as-is to the sdnoverlay CNI config.

    {
        "Destinations": [
            "169.254.169.254"
        ],
        "Type": "OutBoundNAT"
    }

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 year ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2132#issuecomment-1336540196):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

marosset commented 1 year ago

/reopen
/lifecycle frozen

k8s-ci-robot commented 1 year ago

@marosset: Reopened this issue.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2132#issuecomment-1337886844):

> /reopen
> /lifecycle frozen

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

willie-yao commented 1 year ago

@marosset Should we keep this issue open? I saw in #3283 that you mentioned why IMDS is not reachable on CAPZ:

> I believe it is a limitation of overlay networking on Windows in general (not specific to calico)

jsturtevant commented 1 year ago

let's keep open

andyzhangx commented 1 year ago

We worked around this issue in https://github.com/kubernetes-sigs/azuredisk-csi-driver/pull/1200: if IMDS is not available, the driver falls back to getting the instance type from node labels, so a host-process deployment is not mandatory in this case.

I0617 13:33:09.076571    5940 utils.go:77] GRPC call: /csi.v1.Node/NodeGetInfo
I0617 13:33:09.076571    5940 utils.go:78] GRPC request: {}
W0617 13:33:30.089992    5940 nodeserver.go:382] get instance type(capz-8ken-5z262) failed with: Get "http://169.254.169.254/metadata/instance?api-version=2021-10-01&format=json": dial tcp 169.254.169.254:80: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
W0617 13:33:30.092123    5940 nodeserver.go:385] fall back to get instance type from node labels
I0617 13:33:30.096506    5940 round_trippers.go:553] GET https://10.96.0.1:443/api/v1/nodes/capz-8ken-5z262 200 OK in 4 milliseconds
I0617 13:33:30.098487    5940 nodeserver.go:431] got a matching size in getMaxDataDiskCount, VM Size: STANDARD_D4S_V3, MaxDataDiskCount: 8
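
The fallback in that PR is conceptually simple: when IMDS is unreachable, read the instance type from the node object instead. A minimal client-go sketch of the idea (an illustration under stated assumptions, not the driver's actual implementation; the label keys are the standard `node.kubernetes.io/instance-type` and the deprecated `beta.kubernetes.io/instance-type`):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// instanceTypeFromNodeLabels looks up the VM size from the node's labels,
// used as a fallback when IMDS is unreachable.
func instanceTypeFromNodeLabels(ctx context.Context, client kubernetes.Interface, nodeName string) (string, error) {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	if t, ok := node.Labels["node.kubernetes.io/instance-type"]; ok && t != "" {
		return t, nil
	}
	// Older clusters may still carry the deprecated beta label.
	if t, ok := node.Labels["beta.kubernetes.io/instance-type"]; ok && t != "" {
		return t, nil
	}
	return "", fmt.Errorf("node %s has no instance-type label", nodeName)
}

func main() {
	cfg, err := rest.InClusterConfig() // assumes the probe runs in-cluster
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	vmSize, err := instanceTypeFromNodeLabels(context.Background(), client, "capz-8ken-5z262")
	if err != nil {
		panic(err)
	}
	fmt.Println("instance type:", vmSize)
}
```
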
mboersma commented 1 year ago

/priority backlog

jsturtevant commented 5 months ago

> I wonder if this has to do with the requirement that only Node IPs are allowed to access the IMDS. https://docs.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service?tabs=windows#known-issues-and-faq
>
> @marosset can you try to add destination-based OutboundNAT? In Azure CNI you can add this as follows:
>
>     {
>         "Name": "EndpointPolicy",
>         "Value": {
>             "Type": "LoopbackDSR",
>             "IPAddress": "169.254.169.254"
>         }
>     }
>
> In the HNS endpoint you should see the following policy added, which you should be able to add as-is to the sdnoverlay CNI config.
>
>     {
>         "Destinations": [
>             "169.254.169.254"
>         ],
>         "Type": "OutBoundNAT"
>     }

This could be a possible solution, and a workaround already exists in the meantime.

k8s-triage-robot commented 2 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 month ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 week ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 week ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2132#issuecomment-2308965618):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.