siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

Avoid using --insecure during bootstrap #9521

Open shellwhale opened 1 month ago

shellwhale commented 1 month ago

Hello.

From my understanding, the initial configuration (the bootstrap phase) is not authenticated. This means it is a Trust On First Use (TOFU) authentication scheme, which is vulnerable to man-in-the-middle attacks.

Is there a way to embed a certificate inside of the image, before the initial configuration that happens over the network? That way we could get rid of the --insecure flag.

frezbo commented 1 month ago

bootstrap API requires proper certificate, initial apply-config can only use --insecure

shellwhale commented 1 month ago

> bootstrap API requires proper certificate, initial apply-config can only use --insecure

Yes, that's why I'm asking. It does not feel appropriate to use an insecure authentication scheme right from the start.

Again, can't I generate a certificate myself and embed it in the image somehow, maybe as a step in factory.talos.dev or using the imager?

shellwhale commented 1 month ago

Anyone with network access can configure the machine or impersonate it. Isn't that an issue?

shellwhale commented 1 month ago

There,

https://www.talos.dev/v1.8/learn-more/image-factory/#schematics
https://www.talos.dev/v1.8/talos-guides/install/boot-assets/#image-factory

I feel this is where there should be an explanation of how to set up a custom certificate at the image level, not at the network level, but I can't find one.

smira commented 1 month ago

> Is there a way to embed a certificate inside of the image, before the initial configuration that happens over the network? That way we could get rid of the --insecure flag.

If you follow console output of Talos, you would notice that it prints the fingerprint of its own certificate which you can use with --insecure apply config. This is already implemented.

You can submit machine configuration to Talos via many other methods as well, which have their own pros and cons, but if you worry about man-in-the-middle specifically, use the fingerprint shown in the console.
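As a rough illustration of the fingerprint-pinned call described above, here is a minimal Python sketch that only assembles the `talosctl apply-config` command line. It assumes the `--cert-fingerprint` flag is available in your talosctl version (check `talosctl apply-config --help`); the node address, file name, and fingerprint placeholder are hypothetical values, not taken from this thread.

```python
import shlex


def apply_config_command(node_ip: str, config_path: str, fingerprint: str) -> list[str]:
    """Build the argv for a fingerprint-pinned `talosctl apply-config` call.

    Assumption: the installed talosctl accepts --cert-fingerprint together
    with --insecure for the maintenance-mode API.
    """
    return [
        "talosctl", "apply-config",
        "--insecure",                        # maintenance API has no CA-issued cert yet
        "--nodes", node_ip,
        "--file", config_path,
        "--cert-fingerprint", fingerprint,   # value copied from the node's console
    ]


# Hypothetical values for illustration only.
cmd = apply_config_command("192.0.2.10", "controlplane.yaml", "<fingerprint-from-console>")
print(shlex.join(cmd))
```

The sketch prints the command instead of running it, so the fingerprint can be checked against the console output before anything is sent to the node.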

shellwhale commented 1 month ago

@smira but isn't the CA the same for everyone?

shellwhale commented 1 month ago

> > Is there a way to embed a certificate inside of the image, before the initial configuration that happens over the network? That way we could get rid of the --insecure flag.
>
> If you follow console output of Talos, you would notice that it prints the fingerprint of its own certificate which you can use with --insecure apply config. This is already implemented.

So this means the CLI client trusts server certificates signed by some certificate authority. If the signing happens at server boot time, the private key associated with that CA must be stored on the Talos image itself.

Is that what happens? @smira

smira commented 1 month ago

> @smira but isn't the CA the same for everyone?

There is no CA. Without a machine configuration, Talos generates a fresh self-signed certificate to run what we call the "maintenance service API". The fingerprint is printed to the console, so you can use it on the client side (talosctl) to ensure you're talking to the machine you intend to send the machine configuration to.
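To make the idea of pinning a self-signed certificate concrete, here is a minimal sketch of a fingerprint check in Python. It is not Talos's implementation: what exactly gets hashed (the whole certificate or only its public key) and how the digest is encoded are assumptions made for illustration, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def fingerprint(der_bytes: bytes) -> str:
    """Return a base64-encoded SHA-256 digest of the given bytes.

    Assumption: the pinned value is a SHA-256 digest in base64; the real
    format printed by a given tool may differ.
    """
    digest = hashlib.sha256(der_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def matches_pin(der_bytes: bytes, pinned: str) -> bool:
    """Compare the presented certificate's fingerprint with the pinned one."""
    presented = fingerprint(der_bytes).encode("ascii")
    # Constant-time comparison on bytes avoids leaking where the values differ.
    return hmac.compare_digest(presented, pinned.encode("utf-8"))


# Stand-in bytes; a real client would use the DER bytes from the TLS handshake.
node_cert = b"example DER bytes of the node certificate"
pinned = fingerprint(node_cert)  # value the operator read out of band

print(matches_pin(node_cert, pinned))                # True
print(matches_pin(b"attacker certificate", pinned))  # False
```

Because the operator reads the pinned value from the console rather than from the network, a man-in-the-middle presenting a different certificate fails the comparison even though no CA is involved.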

So we're talking here about delivering machine configuration to a Talos node. There are several options, and they are not all equivalent, but here is the list:

So there are many options, each with its own pros and cons: better UX or better security. There's also the question of running untrusted workloads and whether those workloads can access the same config source (e.g. cloud user-data).

So there is no answer that fits all cases. It's better to ask a specific question: how do I deliver the machine configuration to the node given the requirements that I have?