siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0
6.85k stars 549 forks source link

Find a way to use `grpc-go >= 1.67.0` with `ALPN` disabled #9463

Closed DmitriyMV closed 3 weeks ago

DmitriyMV commented 1 month ago

Starting with v1.67.0 Go gRPC implementation no longer work with TLS connections that don't support ALPN. There is a special header GRPC_ENFORCE_ALPN_ENABLED which, when set to false, should disable this behavior. But setting it after the process started (even in init block) has no effect, because relevant variable initialization happens before any init blocks.

smira commented 1 month ago

My first question - does Talos set up TLS wrong way so that it's not ALPN-enabled? We should fix it in the first place.

howardjohn commented 1 month ago

@smira just stumbled upon this while working on another project. I found if you use GetConfigForClient you may not get the ALPN set properly. See https://github.com/cert-manager/istio-csr/pull/422

smira commented 1 month ago

Thank you, that should be the case for us, as we do live cert reload.

DmitriyMV commented 1 month ago

So this is actually interesting: basically all relevant code in Talos uses credentials.NewTLS which internally uses AppendH2ToNextProtos which ensures that we have "h2" in TLS config. Actually all clients and servers, use it. Except in our crypto library we indeed use GetConfigForClient which drops doesn't carry those changes to tls.Config since gRPC NewTLS operate on copy.

What is good, is that even before gRPC 1.67.0 the client would send the proper "h2" tag during negotiation. It should have warned us about incorrect ALPN settings even on v1.65.0 gRPC but I guess we are silencing that log somewhere. I also have no idea how that works in the first place, since Go TLS negotiateALPN should revert to http/1.1 if there is no proper ALPN.

Edit: look like in h2_bundle.go (http2ConfigureServer and http2ConfigureTransports and newTLSConfig method) it does setup NextProtos field.

smira commented 4 weeks ago

Whatever way we end up with fixing it, we should have a way for 1.9 talosctl to talk to 1.8 Talos.

DmitriyMV commented 4 weeks ago

So at the end of the day, Omni needs GRPC_ENFORCE_ALPN_ENABLED=false for integration test and server. For Talos GRPC_ENFORCE_ALPN_ENABLED=false is needed only for talosctl in cases where in tries to install\create older Talos clusters.