golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.68k stars 17.62k forks source link

crypto/tls: add (*tls.Conn).HandshakeContext and add context to ClientHelloInfo and CertificateRequestInfo #32406

Closed johanbrandhorst closed 3 years ago

johanbrandhorst commented 5 years ago

Proposal

I propose an unexported context.Context field is added to the ClientHelloInfo and CertificateRequestInfo crypto/tls types. We should also add a Context() context.Context method to these types to access this context. Further, we should add a new method, HandshakeContext(context.Context) error to the tls.Conn struct, which will be used to propagate a context down the handshake call stack. The existing Handshake() error would call to the new method with a context.Background() context.

Standard library uses of (*tls.Conn).Handshake() should be moved over to the new method, where appropriate. For example, it is not clear that it is appropriate to change the existing Read and Write methods on the *tls.Conn to use the new handshake method, but in (*net/http.Server).serve, it is clear that moving to the new function would enhance the request lifetime control in the function.

The context.Context provided to HandshakeContext would only be used for cancellation of the handshake itself, and once the handshake has completed, cancelling the context will have no effect. This is in line with the predecent set by (*net.Dialer).DialContext.

Motivation

In recent Go releases, we've been able to use the handy GetCertificate and GetClientCertificate methods of the *tls.Config to dynamically control certificate management in Go apps. This is fantastic, and has lead to things like https://godoc.org/golang.org/x/crypto/acme/autocert and https://github.com/johanbrandhorst/certify which are somewhat unique to the Go ecosystem.

Unfortunately, one glaring omission from the API is a connection context for cancellation and request scoped variable propagation. This means users have to implement custom timeouts or block their TLS connections forever in case of problems. It also means powerful observability tools like tracing and metrics that make use of the context cannot be used.

Interaction with net/http

net/http.Server provide BaseContext, which is used to set a global context for the duration of (*http.Server).Serve, and ConnContext, which is used on every new connection. The context passed to (*tls.Conn).HandshakeContext would necessarily be a child of these contexts, as the existing Handshake call is made after these contexts are created. See

FiloSottile commented 5 years ago

It would help if you could elaborate on the various use cases: what you are trying to do in each situation, what doesn't work at the moment, and how a context would help.

I've in the past wanted to surface details of the ClientHelloInfo to net/http Handlers, so I can see the use case, but I'd like to build a generic solution.

johanbrandhorst commented 5 years ago

My use case specifically is to allow my library (certify) to cancel its outgoing requests if the incoming connection is closed. Additionally, it would allow detailed tracing to capture the latency cost of dynamically provisioned TLS certificates, something that is currently hidden inside the TLS handshake time in the standard library. Simply having a context associated with the underlying connection that could be used in outgoing net/http requests would be enough.

bradfitz commented 5 years ago

@johanbrandhorst, sounds like good reasons. For the same reason we didn't add a Context struct field to net/http.Request and used a method instead, we should instead add a Context method to crypto/tls.ClientHelloInfo.

johanbrandhorst commented 5 years ago

OK, I will attempt to implement this.

johanbrandhorst commented 5 years ago

Do the tags need updating?

gopherbot commented 5 years ago

Change https://golang.org/cl/181097 mentions this issue: crypto/tls, net/http: add context to tls structs

johanbrandhorst commented 5 years ago

Since the tree has recently opened again, bumping this for another look at the initial implementation. Could we tag this with 1.14? I think this proposal was accepted in https://github.com/golang/go/issues/32406#issuecomment-498709215.

mvdan commented 5 years ago

@bradfitz @FiloSottile could you please clarify whether this proposal is accepted? If so, we can milestone it for 1.14 and review the CL.

bradfitz commented 5 years ago

Looks like it's stuck in the Crypto Proposal Review queue. @FiloSottile owns that meeting.

johanbrandhorst commented 5 years ago

Friendly ping on this. @FiloSottile is there an update on this?

johanbrandhorst commented 5 years ago

Friendly ping.

johanbrandhorst commented 4 years ago

Friendly ping. @FiloSottile anything I can do to help with the Crypto Proposal Review queue?

johanbrandhorst commented 4 years ago

Friendliest of bumps.

johanbrandhorst commented 4 years ago

With 1.14 out, is there a plan to review the crypto proposals?

rsc commented 4 years ago

@johanbrandhorst, crypto proposal review is proceeding fairly well (see all the crypto proposals in the minutes at https://golang.org/s/proposal-minutes), but there are many.

rsc commented 4 years ago

Ping @FiloSottile and @katiehockman.

FiloSottile commented 4 years ago

Hey, sorry for the delay, this one fell off my radar.

Another common problem that would be nice to solve at the same time is propagating info from the callbacks to net/http handlers. Maybe exposing this context also from ConnectionState would work, as it's exposed as Request.TLS? Or should the net/http Request context be a child of the TLS one? (How would that work with Server.BaseContext and Server.ConnContext?)

Another reason to expose it on ConnectionState is that the new VerifyConnection callback gets a ConnectionState as input. All other callbacks should be covered by ClientHelloInfo and CertificateRequestInfo.

Can we get a full API proposal here on the issue, along with details of how it would interact with net/http?

johanbrandhorst commented 4 years ago

Thanks for looping back on this @FiloSottile, I've updated the first post with a full API proposal. It does not support passing data back via the context as it is read only. We would need to make the context an exported field to support this, which was argued against in https://github.com/golang/go/issues/32406#issuecomment-498709215. I'm open to discuss other solutions, or changing the context to be an exported field.

What are your thoughts?

FiloSottile commented 4 years ago

If the TLS context needs to be a child of the ConnContext, how does that happen? I can't see a way for crypto/tls to reach the ConnContext, nor for net/http to set the TLS context.

I unfortunately haven't designed a lot of APIs with Context, so I could use some advice (maybe from @bcmills?) on this proposal.

johanbrandhorst commented 4 years ago

I'm not an expert on the net/http server by any stretch, but I thought we could assign to it here: https://github.com/golang/go/blob/b371f189dfdfb2454a20ec276de55fe884d6ff9f/src/net/http/server.go#L1782, where we have both the ctx inherited from ConnContext and the *tls.Conn.

FiloSottile commented 4 years ago

Assign it how? The proposal does not expose any way to set the Context.

johanbrandhorst commented 4 years ago

Ah, good point, we would have to expose something like a WithContext to assign a context to the conn.

I will update the proposal.

johanbrandhorst commented 4 years ago

I've added the (*crypto/tls.Conn).WithContext method description to the proposal.

rsc commented 4 years ago

ping @filosottile and @katiehockman

rsc commented 4 years ago

Talked to @FiloSottile.

http.Request.WithContext returns a derived http.Request that now has the context. In the callback you are trying to use, there's no way to return a new tls.Conn, so the tls.Conn.WithContext is a setter - it mutates the receiver instead of returning a derived copy.

So at least in this proposal, WithContext should be named SetContext.

But then the problem is this is the first SetContext we have in the standard library, and it's unclear that's the right path to go down.

/cc @bcmills @Sajmani

johanbrandhorst commented 4 years ago

I agree with your analysis and that SetContext would set a potentially dangerous precedent. I will re-examine the problem and see if there's a better way of merging the HTTP context into the tls connection.

johanbrandhorst commented 4 years ago

I've updated the proposal to change the use of a SetContext method to instead create a new HandshakeContext method on the tls.Conn. This method would be used to propagate the context down the call stack into the GetClientCertificate and GetCertificate callbacks. PTAL.

gopherbot commented 4 years ago

Change https://golang.org/cl/246338 mentions this issue: DO NOT REVIEW: crypto/tls: add HandshakeContext method to Conn

johanbrandhorst commented 4 years ago

I took a stab at what this would look like here: https://go-review.googlesource.com/c/go/+/246338. Forgive me if it's inappropriate to make this sort of experimental change before a proposal has been accepted.

rsc commented 4 years ago

I admit I am a bit confused about how HandshakeContext would be used in that CL. I was expecting maybe a new callback that returns a context. What would the client use of HandshakeContext instead of the old callbacks look like?

johanbrandhorst commented 4 years ago

HandshakeContext passes the context down the stack so that the callbacks (which have access to ClientHelloInfo and CertificateRequestInfo respectively) can access the request context via Context() on the two structs.

johanbrandhorst commented 4 years ago

I think I may have misunderstood; are you asking how HandshakeContext is used in the http.Client? If so, I'm not sure I have an answer for that, it seems to me that we don't have access to the context where we're doing the handshake on the client side right now. I will revisit the CL to see if that can be changed without too much disruption.

johanbrandhorst commented 4 years ago

I've updated the CL, replacing the only other call to (*tls.Conn).Handshake in net/http with HandhakeContext using the request context. The CL should now handle all TLS handshakes in the client and server so that both client and server callbacks have access to the request context.

rsc commented 4 years ago

/cc @FiloSottile

johanbrandhorst commented 4 years ago

Friendliest of bumps

rsc commented 4 years ago

@johanbrandhorst, I'm still a bit confused. What I see in your CL is code that is plumbing contexts down to the Handshake and other getters, but I don't see how what's there would help with what @FiloSottile said above:

Another common problem that would be nice to solve at the same time is propagating info from the callbacks to net/http handlers. Maybe exposing this context also from ConnectionState would work, as it's exposed as Request.TLS? Or should the net/http Request context be a child of the TLS one? (How would that work with Server.BaseContext and Server.ConnContext?)

Another reason to expose it on ConnectionState is that the new VerifyConnection callback gets a ConnectionState as input. All other callbacks should be covered by ClientHelloInfo and CertificateRequestInfo.

Can we get a full API proposal here on the issue, along with details of how it would interact with net/http?

I can't tell from the CL what you are proposing to answer this part. Can you explain here in a comment? Maybe the answer is "don't do that", but if so it would be good to understand why.

johanbrandhorst commented 4 years ago

Ah, I see, sorry for the confusion. I had hoped that would be possible when I first started looking into it, but I now believe this is not going to be possible without either;

  1. Adding a mutable context to ConnectionState, ClientHelloInfo and CertificateRequestInfo. This is undesirable because, as mentioned before, it sets a precedent of being able to mutate a context on a struct rather than, as popularized by request.WithContext, allowing immutable type instances to be overwritten by a copy of the instance with a context added. I don't really want to go down that path.
  2. Returning a context from the callbacks. This is kind of a nonstarter since it'd either require us to make a breaking change or require us to add a new variant of each of the callbacks, something I'm sure none of us want.

I can't think of any other alternatives for passing a new context back from the callbacks. Maybe I've missed something? In any case, my current proposal does not have a working solution for this problem. The current proposal only tackles the part about propagating the context from the HTTP server down into the TLS callbacks, with no way of returning any new data from the callbacks back up the call stack.

I've taken the time to update my proposal and CL to include exposing the context on the ConnectionState too.

rsc commented 4 years ago

It sounds like this proposal is only going to deal with passing a context down to the various GetCertificate callbacks, by adding a Context() context.Context method to the info structs.

It is not handling passing information back from handlers up to the tls.Conn and from there to http.Handlers. We don't know the right way to do that, but the two different directions would almost certainly be different APIs anyway.

@FiloSottile, are you are OK with proceeding with the downward direction separate from the upward one?

rsc commented 4 years ago

ping @FiloSottile

FiloSottile commented 4 years ago

Yeah, it's unfortunate we don't get to use the Context in both directions from the callbacks, but it sounds like there would be no good way to do that. I'm ok with not solving that part of the problem.

What I'm still trying to wrap my head around is the semantics of HandshakeContext. Is that context only covering the handshake operation or the entire connection? The former sounds like the expected behavior, but it also sounds less useful, because it only allows cancelling the handshake, not the whole connection. If the former, what does ConnectionState.Context return after the handshake has completed? If the latter, it feels awkward to use the Handshake operation to attack a connection-scoped context.

Moreover, under what conditions should the context exposed to the callbacks get cancelled? Just when the parent context passed to HandshakeContext is cancelled, or should we also monitor for connection closure?

rsc commented 4 years ago

@johanbrandhorst Is your CL up to date for the most recent discussion?

johanbrandhorst commented 4 years ago

I have not yet had time to think about Filippos latest comments (thanks!) so the CL is only up-to-date given current proposal details. I will set aside some time to consider what changes to make in response to Filippos comments tomorrow.

johanbrandhorst commented 4 years ago

What I'm still trying to wrap my head around is the semantics of HandshakeContext. Is that context only covering the handshake operation or the entire connection? The former sounds like the expected behavior, but it also sounds less useful, because it only allows cancelling the handshake, not the whole connection.

I propose that the context covers the handshake operation, much like (*net.Dialer).DialContext. Cancellation of the context after the handshake has completed has no effect on the connection.

If the former, what does ConnectionState.Context return after the handshake has completed? If the latter, it feels awkward to use the Handshake operation to attack a connection-scoped context.

I agree that a context associated with the ConnectionState implies to the user that it is relevant for the lifetime of the connection. I have updated the proposal to no longer add the context provided to HandshakeContext to ConnectionState, in line with my original proposal. I think we can consider what adding a context to ConnectionState would mean semantically separately from this proposal. I have also updated the CL.

Moreover, under what conditions should the context exposed to the callbacks get cancelled? Just when the parent context passed to HandshakeContext is cancelled, or should we also monitor for connection closure?

We should not need to add any additional cancellation logic to the context after it is passed to HandshakeContext, it should only be subject to the cancellation imposed by the context provided. Further cancellation logic within the TLS stack could be something added in another proposal, if it becomes clear that this initial attempt is not sufficient.

FiloSottile commented 4 years ago

Thank you @johanbrandhorst for bearing with us through the iterations.

I suppose we could also make ConnectionState.Context() return nil if HandshakeComplete is true. A lot of the fields already depend on that. We can also call the method ConnectionState.HandshakeContext() if we want to be extra clear.

Now that I have clear what the intended behavior is, can you help me understand an example use case for it? I'm sure there are but I haven't found myself using contexts in practice that often (due to not building whole applications most of the time).

johanbrandhorst commented 4 years ago

I added some details in the motivation section of the proposal, but briefly;

I've written a library that can issue certificates on-demand, and it hooks into the TLS callbacks. Unfortunately, because this often requires talking to a third party CA (Such as Vault or CFSSL), there is often a large amount of time spent in I/O. Since the request context is not propagated into the callbacks, cancellation and timeouts have to be configured separately, and your request could wait for seconds after cancellation before it is actually cancelled. Also because the context and its request scoped variables aren't propagated into the callbacks, tools such as tracing cannot see where the time is spent.

I hope that explains the use case I'm looking to answer.

FiloSottile commented 4 years ago

That makes sense, thank you.

Opinions on this part?

I suppose we could also make ConnectionState.Context() return nil if HandshakeComplete is true. A lot of the fields already depend on that. We can also call the method ConnectionState.HandshakeContext() if we want to be extra clear.

johanbrandhorst commented 4 years ago

Opinions on this part?

I suppose we could also make ConnectionState.Context() return nil if HandshakeComplete is true. A lot of the fields already depend on that. We can also call the method ConnectionState.HandshakeContext() if we want to be extra clear.

It doesn't feel like a great user experience to have a Context method that sometimes returns nil. HandshakeContext could work but I still don't know about any use case that requires it. My original use case only uses the first two callbacks. I think any addition to ConnectionState should be considered separately, as and when required by a use case.

FiloSottile commented 4 years ago

Fair enough. To recap, here are the API changes. I'm happy with this.

(*ClientHelloInfo) Context() context.Context
(*CertificateRequestInfo) Context() context.Context
(*Conn) HandshakeContext(ctx context.Context) error

Two comments based on the CL:

The good news is that we can probably replace some of the Dial timeout complexity with HandshakeContext timeouts.

johanbrandhorst commented 4 years ago

Awesome, thank you for your thoughtful comments. I've implemented your first two suggestions and update the CL.

close the connection and kill the handshake when the context is cancelled, if I pass a context with a timeout to HandshakeContext I expect that timeout to apply to the handshake

Do you have any tips on how to accomplish this? A brief look at the code implies this would involve implementing a cancellable io.Conn wrapper, which is usually done by setting deadlines mid-connection (I implemented it for the varlink library, but it's a bit messy and I'm wondering if I'm missing some more obvious solution? AFAIK there isn't an existing implementation of a cancellable io.Conn in the standard library.

We can move this dicussion to https://go-review.googlesource.com/c/go/+/246338 if you like.

FiloSottile commented 4 years ago

Yeah, we can move to Gerrit if we all agree on how the behavior should look like!

Do you have any tips on how to accomplish this?

Should be enough to start a goroutine in HandshakeContext that closes the connection if the context is cancelled before the function returns. Will need to take care to return the context error instead of the network error in that case.