Closed alban closed 4 years ago
This patch makes Lokomotive provide its own kubelet-wrapper and etcd-wrapper scripts so that it is not dependent on a specific channel of Flatcar.
I guess there was a reason to couple Flatcar Linux with etcd-wrapper, at least (probably kubelet-wrapper too). Is this just overriding it, or is the plan to remove it from Flatcar Linux and have Lokomotive handle it? Would it make sense to double-check why this was coupled in the first place? IMHO, that would be ideal.
Also, if we are removing these from Flatcar Linux, couldn't this affect other Flatcar Linux users not using Lokomotive?
I'm not sure why it was coupled in the first place.
We won't remove them from Flatcar immediately; any removal should follow a deprecation period. Note that the path is different: Flatcar has the wrapper scripts in the read-only /usr and this Terraform patch adds the scripts in /etc/kubernetes (it needs a mount with write access).
@alban oh, okay. It sounds reasonable to me, then. Thanks! :-)
@rata we don't remove them from Flatcar. We just make it independent, so we have better control over it in Lokomotive.
I guess there was a reason to couple Flatcar Linux with etcd-wrapper, at least (probably kubelet-wrapper too).
I'm not aware of the exact history of kubelet-wrapper, etc. I suppose what happened was: kubelet-wrapper was first written only for CoreOS Container Linux. After that, CoreOS built Tectonic upon Container Linux, by simply calling kubelet-wrapper that was supposed to be already included in the underlying OS. Afterwards, Typhoon was forked from Tectonic, Lokomotive was forked from Typhoon, but the coupling has not changed.
How does Typhoon solve that on non-CoreOS distros?
On Wed, Nov 27, 2019 at 11:43 AM Mateusz Gozdek notifications@github.com wrote:
How does Typhoon solve that on non-CoreOS distros?
It uses a totally different approach; this issue has some of the reasoning: https://github.com/poseidon/typhoon/issues/91#issuecomment-386847684
How does Typhoon solve that on non-CoreOS distros?
Good question. A while ago, it was using "atomic install" to set up a super-privileged container and pull the kubelet from quay.io/poseidon/kubelet:v1.14.1:
https://github.com/poseidon/typhoon/blob/5eb11f510493bdcfac83ae530552a4f144fbd0b1/aws/fedora-atomic/kubernetes/cloudinit/controller.yaml.tmpl#L95
But then, "atomic install" has been deprecated and it is not used in Typhoon anymore: https://github.com/poseidon/typhoon/pull/501/files
Now it is using podman: https://github.com/poseidon/typhoon/blob/ddea7dc45252080eefabaf26d3bd18c3931abfb1/aws/fedora-coreos/kubernetes/workers/fcc/worker.yaml#L34
Merge conflicts here.
Ping @alban
I am not working on it at the moment. If someone has the bandwidth to take over this PR, that's fine for me. The PR is not ready at the moment: it needs changes for other cloud providers and other things in the TODO list.
I will close it for now then and we can come back to it in the future.
Would be great to have this picked up again. The Flannel ToDo can be ignored since it is not supported anymore.
If AWS user data is still a problem, you could also store the scripts in the GitHub repository and refer to them via URL in Ignition:
"contents": {
  "source": "https://raw-github-url",
  "verification": { "hash": "sha512-<SHA-512 digest, if wanted>" }
}
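For the `verification` part, Ignition expects a hash string of the form `<type>-<value>`. A minimal Python sketch of computing it, assuming SHA-512 as the hash type; the helper name `ignition_contents` is hypothetical and the URL is the same placeholder as above, not a real address:

```python
import hashlib
import json


def ignition_contents(source_url: str, content: bytes) -> dict:
    """Build a "contents" object for a remotely fetched file.

    `content` is the expected file body; it is only used to compute the
    digest that Ignition compares against the downloaded data.
    """
    digest = hashlib.sha512(content).hexdigest()
    return {
        "source": source_url,
        "verification": {"hash": f"sha512-{digest}"},
    }


if __name__ == "__main__":
    # Stand-in script body; in practice this would be the real wrapper script.
    script = b"#!/bin/bash\nexec docker run ...\n"
    entry = ignition_contents("https://raw-github-url", script)
    print(json.dumps({"contents": entry}, indent=2))
```

With this approach only the URL and the digest count against the user data size, not the script body itself.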
The kubelet-wrapper and etcd-wrapper scripts are part of Flatcar Container Linux. They are tightly coupled with the service files kubelet.service and etcd-member.service, respectively. The service files are part of Lokomotive.
This coupling consists of environment variables and container runtime parameters. This causes problems because any change (e.g. moving from rkt to Docker) requires synchronisation between Flatcar and Lokomotive releases.
This patch makes Lokomotive provide its own kubelet-wrapper and etcd-wrapper scripts so that it is not dependent on a specific channel of Flatcar.
This patch also changes the container runtime to Docker.
Since Docker does not support sd_notify, I remove the related code. To fix this, we could use sdnotify-proxy. See links: https://www.freedesktop.org/software/systemd/man/sd_notify.html https://github.com/coreos/sdnotify-proxy
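For context, the sd_notify protocol itself is small: the service sends a datagram such as `READY=1` to the AF_UNIX socket named in `$NOTIFY_SOCKET`. A minimal Python sketch of the sending side follows. It only illustrates the protocol from the man page linked above; it is not the code removed by this patch and not how sdnotify-proxy is implemented.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send `state` to the socket named in $NOTIFY_SOCKET.

    Returns False when the variable is unset, i.e. the process was not
    started by a service manager that expects a notification.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket,
        # which is addressed with a leading NUL byte instead.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

A unit with `Type=notify` is only considered started once this message arrives, which is what lets dependent units order themselves after it.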
In practice, Lokomotive works without sd_notify, but other services that have a dependency on etcd-member might not run in the correct order. I found only one service in this case: flanneld:
This patch only fixes AWS; Packet and the other providers are left as TODO.
I expect that kubelet-wrapper and etcd-wrapper can be shared for all cloud providers, so I've put them in the 'common' directory. This will make future changes easier to manage.
To make terraform modules self-sufficient (e.g. aws/flatcar-linux/kubernetes), required files should normally not be outside of the root directory of the terraform module (e.g. no access via ../../../). I've added symbolic links to the common directory. If you create assets with vfsgen, the symbolic links are read as regular files so they get added several times in the assets, making the terraform module self-sufficient.
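The symlink behaviour can be illustrated with a small Python sketch. This is not vfsgen (which is a Go tool); it is a hypothetical `collect_assets` helper showing the general effect: a packer that simply opens every file it finds follows symbolic links and ends up with a full copy of the target's bytes under the link's own path.

```python
import os


def collect_assets(root: str) -> dict:
    """Map each file path under `root` to its bytes.

    Symbolic links to files are read through, so the result contains the
    content of the link target stored under the link's path.
    """
    assets = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:  # open() follows the symlink
                assets[os.path.relpath(full, root)] = fh.read()
    return assets
```

Running this on a module directory that contains a link such as `kubelet-wrapper -> ../common/kubelet-wrapper` (layout assumed for illustration) yields the script body under `kubelet-wrapper`, with no reference outside the module root.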
Note: with this patch, we are dangerously close to the 16384-byte size limit on AWS user_data. In my tests, we're at around 15860 bytes.
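A check like the following could guard against crossing that limit. It is a minimal sketch, assuming the limit applies to the raw (not base64-encoded) user data measured in bytes; `user_data_headroom` is a hypothetical helper, not part of this patch.

```python
USER_DATA_LIMIT = 16384  # bytes, the limit quoted above


def user_data_headroom(user_data: str, limit: int = USER_DATA_LIMIT) -> int:
    """Return how many bytes are left before `user_data` hits `limit`.

    A negative result means the rendered config is already too large.
    """
    size = len(user_data.encode("utf-8"))
    return limit - size


if __name__ == "__main__":
    rendered = "x" * 15860  # stand-in for the rendered Ignition config
    print(f"{user_data_headroom(rendered)} bytes of headroom")
```

At the size measured in the tests above, this leaves only a few hundred bytes for future additions to the wrapper scripts.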
TODO: