anuraaga opened 8 years ago
I found that replacing --dns=8.8.8.8 with mounting the host's /etc/resolv.conf works. Is there a reason not to mount resolv.conf for this command?
Works:

```yaml
- path: /opt/bin/decrypt-tls-assets
  owner: root:root
  permissions: 0700
  content: |
    #!/bin/bash -e
    for encKey in $(find /etc/kubernetes/ssl/*.pem.enc); do
      sudo rkt run \
        --volume=ssl,kind=host,source=/etc/kubernetes/ssl,readOnly=false \
        --mount=volume=ssl,target=/etc/kubernetes/ssl \
        --uuid-file-save=/var/run/coreos/decrypt-tls-assets.uuid \
        --volume=dns,kind=host,source=/etc/resolv.conf,readOnly=true \
        --mount volume=dns,target=/etc/resolv.conf \
        --net=host \
        --trust-keys-from-https \
        quay.io/coreos/awscli --exec=/bin/bash -- \
        -c \
        "/usr/bin/aws \
          --region {{.Region}} kms decrypt \
          --ciphertext-blob fileb://$encKey \
          --output text \
          --query Plaintext \
          > $encKey.b64"
      base64 --decode < $encKey.b64 > ${encKey%.enc}
      sudo rkt rm --uuid-file=/var/run/coreos/decrypt-tls-assets.uuid
    done
```
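For completeness, the resolv.conf mount pair can be sanity-checked on its own. This is a sketch, assuming the same image and flags as the script above; whether `getent` is available inside the awscli image is an assumption, not something confirmed in this thread.

```shell
# One-off check (sketch): run the same awscli image with the host's
# /etc/resolv.conf mounted read-only and try to resolve the KMS
# endpoint from inside the container.
sudo rkt run \
  --volume=dns,kind=host,source=/etc/resolv.conf,readOnly=true \
  --mount volume=dns,target=/etc/resolv.conf \
  --net=host \
  --trust-keys-from-https \
  quay.io/coreos/awscli --exec=/bin/bash -- \
  -c "getent hosts kms.ap-northeast-1.amazonaws.com"
```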
👍 I think we should be careful in general about peppering code with 8.8.8.8
Hello Kubernetes Community,
Future work on kube-aws will be moved to a new dedicated repository. @mumoshu will be running point on maintaining that repository, so please move all issues and PRs over there as soon as you can. We will be halting active development on the AWS portion of this repository in the near future. We will continue to maintain the Vagrant single- and multi-node distributions in this repository, along with our hyperkube container image.
A community announcement to end users will be made once the transition is complete. We at CoreOS ask that those reading this message avoid publicizing/blogging about the transition until the official announcement has been made to the community in the next week.
The new dedicated kube-aws repository already has the following features merged in:
If anyone in the Kubernetes community would like to be involved with maintaining this new repository, find @chom and/or @mumoshu on the Kubernetes slack in the #sig-aws channel or via direct message.
~CoreOS Infra Team
@anuraaga Just curious, but could it be that your infrastructure/network is blocking access to Google Public DNS?
// Once I understand the problem correctly, I'd like to merge https://github.com/coreos/kube-aws/issues/6 ASAP!
Thanks for checking on this; I should have verified the lookup against 8.8.8.8 on the node with dig first. Indeed, this wasn't working:

```shell
dig @8.8.8.8 kms.ap-northeast-1.amazonaws.com
```
It's weird, since I had opened up port 53 for both TCP and UDP in the network ACL. I found that the only way I could reliably get that dig command to work was to open up UDP ports 1-65535. Cutting the range down to, e.g., 40000-65535 would allow maybe 30% of the requests to work, as if a random port is being picked each time. Not sure why this behavior would happen. I definitely don't want to have to open up all these ports.
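For reference, the source-port range the kernel picks from can be read directly on the node, and a port check makes the partial-range behavior concrete. This is a sketch; the `in_range` helper is hypothetical, and 32768-60999 is only the typical modern Linux default.

```shell
# On Linux, print the ephemeral port range used for outgoing
# connections (DNS queries pick their source port from this range).
[ -r /proc/sys/net/ipv4/ip_local_port_range ] \
  && cat /proc/sys/net/ipv4/ip_local_port_range || true

# Hypothetical helper: does a given source port fall inside a range?
in_range() {
  local port=$1 low=$2 high=$3
  [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]
}

# A query sourced from port 35000 would be dropped by an ACL that
# only allows 40000-65535, but allowed by one covering 32768-61000.
in_range 35000 40000 65535 && echo allowed || echo dropped   # dropped
in_range 35000 32768 61000 && echo allowed || echo dropped   # allowed
```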
Anyways, would still prefer to have DNS requests in general not leaving the VPC even during Kubernetes bootstrap if it makes sense.
@anuraaga Thanks for your response 😄 I guess you should open up ports from 32768 to 61000 in the network ACL's rules; for the DNS responses that would mean the inbound rules, since a reply's destination port is the ephemeral source port of the original query.
AFAIK, Linux kernels generally use ephemeral ports ranging from 32768 to 61000, so DNS clients also draw their source ports from that range. This documentation for AWS VPC would help.
// I'm not entirely sure how exactly DNS clients work, so please correct me if this seems wrong.
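If it helps, such a rule can be added with the AWS CLI. This is a sketch for the inbound-response case (with stateless network ACLs it is the reply's destination port that lands in the ephemeral range); the ACL ID and rule number are placeholders, not values from this thread.

```shell
# Hypothetical inbound rule letting UDP DNS responses back in to the
# kernel's ephemeral port range. acl-xxxxxxxx and rule number 120 are
# placeholders.
aws ec2 create-network-acl-entry \
  --network-acl-id acl-xxxxxxxx \
  --ingress \
  --rule-number 120 \
  --protocol udp \
  --port-range From=32768,To=61000 \
  --rule-action allow \
  --cidr-block 0.0.0.0/0
```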
If you have some time, I suggest running `tcpdump -nn port 53` and then `dig` in another console: you should see the source port vary within the range mentioned above, which means you need to open up the ephemeral port range for the response traffic.
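Spelled out, that check looks like this (a sketch; run the two parts in separate terminals on the node, and note that tcpdump needs root):

```shell
# Terminal 1: capture DNS traffic. -nn keeps addresses and ports
# numeric, so each query's ephemeral source port is visible.
sudo tcpdump -nn port 53

# Terminal 2: issue a few lookups and watch the source port change
# from one query to the next.
dig @8.8.8.8 kms.ap-northeast-1.amazonaws.com
dig @8.8.8.8 kms.ap-northeast-1.amazonaws.com
```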
Anyway,

> would still prefer to have DNS requests in general not leaving the VPC even during Kubernetes bootstrap if it makes sense

I agree with this 👍
Thanks for the explanation, my understanding of ephemeral ports was lacking but it makes sense :)
Thanks for your confirmation! FYI, https://github.com/coreos/kube-aws/pull/6 is merged.
I am trying to start a cluster using kube-aws 0.8.3, but the controller fails to start up because decrypt-tls-assets.service fails. Restarting the service manually results in the same failure, so it's not sporadic.

The error is:

```
awscli[5]: Could not connect to the endpoint URL: "https://kms.ap-northeast-1.amazonaws.com/"
```

I can use curl to access the URL from the node itself. When running curl inside rkt, it cannot resolve the hostname; however, `ping 8.8.8.8` inside rkt works fine.
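The contrast can be reproduced roughly like this (a sketch assuming the quay.io/coreos/awscli image used elsewhere in this thread; curl and ping being present inside the image is an assumption):

```shell
# On the node itself: the KMS endpoint resolves and is reachable.
curl -sS -o /dev/null -w '%{http_code}\n' \
  https://kms.ap-northeast-1.amazonaws.com/

# Inside rkt with --dns=8.8.8.8: name resolution fails...
sudo rkt run --dns=8.8.8.8 --trust-keys-from-https \
  quay.io/coreos/awscli --exec=/bin/bash -- \
  -c "curl -sS https://kms.ap-northeast-1.amazonaws.com/"

# ...while raw IP connectivity from the same container is fine.
sudo rkt run --dns=8.8.8.8 --trust-keys-from-https \
  quay.io/coreos/awscli --exec=/bin/bash -- \
  -c "ping -c 1 8.8.8.8"
```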