Looking into this a bit more, it looks like h2 doesn't support sending custom data on the keepalive ping. https://docs.rs/h2/0.2.1/h2/struct.Ping.html
The other problem here is that hyper doesn't support ping either. So I would think for this custom behavior we would want to have a custom transport. But I don't like the idea of maintaining something that is not hyper. It might be worth opening an issue for custom pings in the hyper repo.
+1
This would currently be a blocker for us for moving from grpc-rs.
I am a bit short on time today, but the next steps would be to open the corresponding issues in h2 and hyper around custom pings. The problem I see here that will make this really hard is that we have not figured out how to send a keepalive ping via tower-service. Most likely we will have to offload some of this to hyper.
cc @seanmonstar
Some more details:

h2 allows sending of pings, but doesn't allow customizing the payload (is this required?), and only allows a single outstanding (user) ping at a time.

I've been experimenting with using that ping internally in hyper for BDP flow control, so if it's also used for keep-alive, then h2 may need to grow support for different user ping payloads. Maybe worth an issue to discuss.

Besides extra support in h2, hyper would need to expose a way to send pings. If so, ideally it's done in a way that supports HTTP/3 in the future (does that have pings?).

Or, I suppose it could just grow a couple of http2_keepalive knobs and do it internally?
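To make the constraint concrete, here's a minimal sketch of h2's user-ping API as described above (assuming a recent h2 that exposes the PingPong handle; this is raw h2, not tonic code):

```rust
// Sketch of the user-ping API h2 exposes. `Ping::opaque()` is the only
// constructor, which is why the payload can't be customized, and only
// one user ping may be in flight at a time.
use h2::{client, Ping};
use tokio::net::TcpStream;

async fn ping_once(addr: &str) -> Result<(), Box<dyn std::error::Error>> {
    let tcp = TcpStream::connect(addr).await?;
    let (_send_request, mut connection) = client::handshake(tcp).await?;

    // The PingPong handle can only be taken once per connection.
    let mut ping_pong = connection.ping_pong().expect("ping_pong already taken");

    // The connection future must be polled for any frames to flow.
    tokio::spawn(async move {
        let _ = connection.await;
    });

    // Send a PING frame with an opaque payload and wait for the PONG.
    ping_pong.ping(Ping::opaque()).await?;
    Ok(())
}
```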
I did a little searching; this is what I found:
So very inconclusive.
Yeah, I think it makes sense for hyper to expose some sort of config option for this. H3 still seems far away but we can introduce it as an http2 option?
Also looks like we may need to echo exactly what is pinged to us in the ping frame.
Reference: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#ping-frame
h2 already automatically responds to PING frames received over the wire.
Relevant PR in hyper providing generic HTTP2 keep-alive support: https://github.com/hyperium/hyper/pull/2151
hyper v0.13.4 now includes options to configure HTTP2 keep-alive. So, next would be taking advantage of them in tonic.
This has been done in #307
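For anyone finding this later, a sketch of what enabling those knobs looks like on tonic's Endpoint builder (method names may differ between versions, so double-check against your tonic's docs):

```rust
// Sketch: the HTTP/2 keep-alive knobs on tonic's Endpoint builder,
// backed by the hyper v0.13.4 support referenced above.
use std::time::Duration;
use tonic::transport::{Channel, Endpoint};

async fn connect() -> Result<Channel, tonic::transport::Error> {
    Endpoint::from_static("http://[::1]:50051")
        // Send an HTTP/2 PING every 30s while the connection is open.
        .http2_keep_alive_interval(Duration::from_secs(30))
        // Tear the connection down if a PING ack doesn't arrive in 10s.
        .keep_alive_timeout(Duration::from_secs(10))
        // Keep pinging even when no RPCs are in flight.
        .keep_alive_while_idle(true)
        .connect()
        .await
}
```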
I was testing this out, and it does seem like there is a slight mismatch from the spec in terms of error codes returned. In hyperium/hyper#2151, KeepAliveTimedOut from h2 seems to get mapped to h2::Reason::INTERNAL_ERROR, but from here:
An expired client initiated PING will cause all calls to be closed with an UNAVAILABLE status. Note that the frequency of PINGs is highly dependent on the network environment, implementations are free to adjust PING frequency based on network and application requirements.
The error I was seeing was:
[status: Internal, message: "h2 protocol error: protocol error: unexpected internal error encountered", details: [], metadata: MetadataMap { headers: {} }]
Does tonic get the information it needs to be able to map that to UNAVAILABLE, or would that require a change in h2?
@seanmonstar ^ we likely want to forward that?
We could maybe get hyper's error type to have is_timeout() return true in that case also, and then maybe tonic could look for that? Or maybe tonic doesn't have that tight coupling with hyper.
I was able to take a closer look at hyper::Error and how it might integrate with tonic::Status. I'll follow up with that in https://github.com/hyperium/tonic/pull/629 once I've had a chance to clean up some of the tests. It seems like there is a reasonable way to handle both the is_timeout() case and the connection errors that I was seeing in that PR.
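As a sketch of the mapping being discussed, assuming hyper's error type exposes is_timeout() as proposed above; map_transport_error is a hypothetical helper, not an actual tonic or hyper API:

```rust
// Hypothetical helper (not a real tonic/hyper API) showing the mapping
// discussed here: walk the error source chain, and if it bottoms out in
// a hyper timeout (e.g. the keep-alive timer expiring), surface
// UNAVAILABLE instead of INTERNAL.
use std::error::Error as StdError;
use tonic::Status;

fn map_transport_error(err: &(dyn StdError + 'static)) -> Status {
    let mut source: Option<&(dyn StdError + 'static)> = Some(err);
    while let Some(e) = source {
        if let Some(hyper_err) = e.downcast_ref::<hyper::Error>() {
            if hyper_err.is_timeout() {
                return Status::unavailable("keep-alive ping timed out");
            }
        }
        source = e.source();
    }
    Status::internal("transport error")
}
```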
How can the tonic server be configured to allow PERMIT_KEEPALIVE_WITHOUT_CALLS? Currently I get RESOURCE_EXHAUSTED: HTTP/2 error code: ENHANCE_YOUR_CALM (Bandwidth exhausted) when I configure my client with keepAliveWithoutCalls = true
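For reference, the server-side knobs that do exist on tonic's Server builder are sketched below; whether anything corresponds directly to grpc-java's permitKeepAliveWithoutCalls is exactly the open question here, and method availability depends on your tonic version:

```rust
// Sketch of the server-side keep-alive knobs on tonic's Server builder.
// Note: these control server-initiated pings; a direct equivalent of
// grpc-java's permitKeepAliveWithoutCalls may not be exposed.
use std::time::Duration;
use tonic::transport::Server;

fn keepalive_server() -> Server {
    Server::builder()
        // Send a server-initiated HTTP/2 PING every 60s.
        .http2_keepalive_interval(Some(Duration::from_secs(60)))
        // Close the connection if the PING ack takes longer than 20s.
        .http2_keepalive_timeout(Some(Duration::from_secs(20)))
}
```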
Feature Request

Hello, gRPC has its own way of doing keepalive; besides that, you could use the SO_KEEPALIVE flag on the TCP socket. Currently, the gRPC HTTP/2 keepalive is not implemented. The ping part of the protocol can be found here. Here is the documentation of the gRPC part on this.

Motivation

Without keepalive, long idle connections will be dropped, especially if you have, for instance, a network load balancer that doesn't know the protocol used.

Proposal

Implement the official gRPC keepalive solution using HTTP/2 PING frames. The server needs to schedule those ping commands and handle the ACKs (and likewise the client).

gRPC - https://github.com/grpc/grpc/blob/master/doc/keepalive.md
HTTP/2 - https://http2.github.io/http2-spec/#PING

From the gRPC FAQ: