Azure / kubernetes-keyvault-flexvol

Azure keyvault integration with Kubernetes via a Flex Volume

Azure Firewall blocking integration due to missing SNI header #137

Closed · jmcshane closed this issue 4 years ago

jmcshane commented 4 years ago

Describe the bug

When I enable Azure Firewall to limit egress traffic (as described here: https://docs.microsoft.com/en-us/azure/aks/limit-egress-traffic), the Key Vault volume fails to mount because the request is blocked by Azure Firewall:

HTTPS request from 10.0.8.4:49424. Action: Deny. Reason: SNI TLS extension was missing.

In kubernetes, the error is:

Unable to mount volumes for pod "myapp-69b5f597f7-krf8l(953c9b05-e939-11e9-9b6d-9eab362ae0e5)": timeout expired waiting for volumes to attach or mount for pod "default"/"myapp-69b5f597f7-krf8l". list of unmounted volumes=[secrets]. list of unattached volumes=[secrets default-token-hbpdc]
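For anyone who wants to see what the firewall is complaining about, here is a minimal sketch of how to compare a TLS handshake with and without SNI using openssl s_client; the vault hostname is a placeholder, and -noservername assumes OpenSSL 1.1.1 or newer:

```sh
# Handshake that sends SNI -- what Azure Firewall can classify:
openssl s_client -connect myvault.vault.azure.net:443 \
  -servername myvault.vault.azure.net </dev/null

# Handshake that omits SNI -- triggers "SNI TLS extension was missing":
openssl s_client -connect myvault.vault.azure.net:443 \
  -noservername </dev/null
```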

Steps To Reproduce

Build an AKS cluster, set up Azure Firewall as the egress control point using https://docs.microsoft.com/en-us/azure/firewall/integrate-lb, and allow traffic on port 443 to the AzureKeyVault service tag but not to the whole AzureCloud tag.

When I allow traffic on port 443 to all of AzureCloud, I am able to retrieve the secret. This suggests that something is missing from the outgoing request's TLS headers, so the firewall cannot recognize the destination IP as AzureKeyVault.
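As a rough sketch, a network rule scoped to the AzureKeyVault service tag (rather than all of AzureCloud) might look like the following Azure CLI call; this assumes the azure-firewall CLI extension, and the resource group, firewall name, and source range are placeholders:

```sh
# Allow outbound 443 only to the AzureKeyVault service tag:
az network firewall network-rule create \
  --resource-group myResourceGroup \
  --firewall-name myFirewall \
  --collection-name keyvault-egress \
  --name allow-keyvault \
  --protocols TCP \
  --source-addresses 10.0.8.0/24 \
  --destination-addresses AzureKeyVault \
  --destination-ports 443 \
  --priority 200 \
  --action Allow
```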

Expected behavior

Azure KeyVault FlexVolume mounts correctly.

Key Vault FlexVolume version

v0.0.14

Access mode: service principal or pod identity

Pod Identity

Kubernetes version

1.14.6

Additional context

I can add a comment outlining the firewall egress rules configured via Terraform.

jmcshane commented 4 years ago

Some more information from the logs on the AKS host:

```
Mon Oct 7 19:54:28 UTC 2019 PODNAME: myapp-69b5f597f7-crmmb
Mon Oct 7 19:54:28 UTC 2019 mount
Mon Oct 7 19:54:28 UTC 2019 /etc/kubernetes/volumeplugins/azure~kv/azurekeyvault-flexvolume -logtostderr=1 -vaultName=vault-ns1ck0ez -vaultObjectNames=cosmosdb-connection-string;api-ssl-cert;api-ssl-key;appinsights-instrumentationkey -vaultObjectAliases=cosmosdb-connection-string;server.pem;server.key;appinsights-instrumentationkey -dir=/var/lib/kubelet/pods/4512f007-e93c-11e9-9b6d-9eab362ae0e5/volumes/azure~kv/test -cloudName= -tenantId=********************* -aADClientSecret=**** -aADClientID= -usePodIdentity=true -podNamespace=default -podName=myapp-69b5f597f7-crmmb -vaultObjectVersions= -vaultObjectTypes=secret;secret;secret;secret
I1007 19:54:28.162298 71953 keyvaultFlexvolumeAdapter.go:33] azurekeyvault-flexvolume 0.0.14
I1007 19:54:28.162383 71953 keyvaultFlexvolumeAdapter.go:42] starting the azurekeyvault-flexvolume, 0.0.14
I1007 19:54:28.162472 71953 oauth.go:123] azure: using pod identity to retrieve token for default/myapp-69b5f597f7-crmmb
Mon Oct 7 19:54:34 UTC 2019 ismounted | mounted
Mon Oct 7 19:54:34 UTC 2019 umount
Mon Oct 7 19:54:34 UTC 2019 rmdir
Mon Oct 7 19:54:34 UTC 2019 INFO: {"status": "Success"}
```

So it's making the call to Key Vault correctly; it's just not able to mount the flexvolume.

brgsstm commented 4 years ago

Did you receive any response on this? I'm facing the exact same issue. Thanks.

aramase commented 4 years ago

@jmcshane @brgsstm could you provide the firewall egress rules being configured? I can then try to reproduce this.

jmcshane commented 4 years ago

@aramase I created FQDN rules for each of the entries in this document: https://docs.microsoft.com/en-us/azure/aks/limit-egress-traffic
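For illustration, an FQDN-based application rule covering a few of the entries from that document could look roughly like the sketch below; the resource names and source range are placeholders, and the full FQDN list is in the linked doc:

```sh
# Example application rule for some of the documented AKS egress FQDNs:
az network firewall application-rule create \
  --resource-group myResourceGroup \
  --firewall-name myFirewall \
  --collection-name aks-egress-fqdns \
  --name aks-required \
  --protocols Https=443 \
  --source-addresses 10.0.8.0/24 \
  --target-fqdns 'management.azure.com' 'login.microsoftonline.com' \
      'mcr.microsoft.com' '*.hcp.eastus.azmk8s.io' \
  --priority 100 \
  --action Allow
```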

je2ryw commented 4 years ago

Similar issue here as well, but with the ingress controller deployment. With all the suggested Azure Firewall egress rules in place, the nginx ingress controller pods fail both their liveness and readiness probes, and the "AzureFirewallApplicationRule" log contains many entries like "HTTPS request from 10.21.131.x:xxxx. Action: Deny. Reason: SNI TLS extension was missing".

aramase commented 4 years ago

When using the pod identity access mode, this can help resolve the issue: https://github.com/Azure/aad-pod-identity/issues/467#issuecomment-577045311

For all other components deployed in a namespace other than kube-system on an AKS cluster with egress lockdown enabled, the API server on port 443 needs to be whitelisted (https://docs.microsoft.com/en-us/azure/aks/limit-egress-traffic#required-ports-and-addresses-for-aks-clusters). If you instead want to whitelist only the FQDN of the API server, you can override the KUBERNETES_SERVICE_HOST env var to be the FQDN of the API server rather than the IP.
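If you go the FQDN route, the API server FQDN can be looked up with the Azure CLI; the resource group and cluster name here are placeholders:

```sh
# Print the cluster's API server FQDN, e.g. myaks-abc123.hcp.eastus.azmk8s.io:
az aks show --resource-group myResourceGroup --name myAKSCluster \
  --query fqdn --output tsv
```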

je2ryw commented 4 years ago

Thanks @aramase! The suggested solution fixed the ingress controller issue. I used the following two commands to override the API server settings for the controller deployment:

```sh
kubectl set env deployment/nginx-ingress-xxx-controller KUBERNETES_SERVICE_HOST=xxx.azmk8s.io
kubectl set env deployment/nginx-ingress-xxx-controller KUBERNETES_SERVICE_PORT=443
```
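To confirm the override took effect, something like the following should show the FQDN instead of the in-cluster service IP; the pod name is a placeholder:

```sh
# Check the env vars inside a running controller pod:
kubectl exec nginx-ingress-xxx-controller-abc123 -- \
  env | grep KUBERNETES_SERVICE
```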

r0bnet commented 4 years ago

I was able to fix this yesterday by allowing traffic to ports 22, 443 and 9000 (to any target); after that the problem went away and my services could connect to the API server. I already had 22 and 9000 allowed, but 443 was missing. I'm not 100% sure, but I think Microsoft added this to their documentation only recently. In any case, we could then get rid of @je2ryw's workaround of setting those vars explicitly, since we have more than one service talking to the API server.
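The rule r0bnet describes would look something like this sketch (placeholder names throughout; 22 and 9000 are the AKS tunnel ports from the egress doc, with 443 added):

```sh
# Allow the AKS tunnel/API ports to any destination:
az network firewall network-rule create \
  --resource-group myResourceGroup \
  --firewall-name myFirewall \
  --collection-name aks-tunnel \
  --name allow-tunnel-ports \
  --protocols TCP \
  --source-addresses 10.0.8.0/24 \
  --destination-addresses '*' \
  --destination-ports 22 443 9000 \
  --priority 300 \
  --action Allow
```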

aramase commented 4 years ago

Closed with https://github.com/Azure/aad-pod-identity/pull/488