grpc / grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
https://grpc.io
Apache License 2.0

grpc client not detecting when the server is restarted. #7688

Closed Shaffath closed 1 month ago

Shaffath commented 1 month ago

I have grpc client code that dials and connects to a server using a custom dialer.

Client code:

    opts := []grpc_retry.CallOption{
        grpc_retry.WithBackoff(grpc_retry.BackoffLinear(500 * time.Millisecond)),
    }

    timeout := grpc.WithTimeout(5 * time.Second)
    conn, err := grpc.Dial(
        mxObj_Gobgp.hostipport,
        timeout,
        grpc.WithBlock(),
        grpc.WithInsecure(),
        grpc.WithDialer(mxObj_Gobgp.customDialer),
        grpc.WithUnaryInterceptor(grpc_retry.UnaryClientInterceptor(opts...)),
    )

The role of the custom dialer is to replay flows to the grpc server once the server is connected.

    func (mxObj_Gobgp *MxMgr_Gobgp) customDialer(addr string, timeout time.Duration) (net.Conn, error) {
        network = "tcp"
        timeout = 3 * time.Second
        addr = hostipport

        conn, err := net.DialTimeout(network, addr, timeout)
        if err != nil {
            logger.Log.Info("Retrying connection to server")
            return nil, err
        } else {
            go doReplay() // Connection with the server is established; start replay.
            return conn, err
        }
    }

The code was working fine with go1.12.

The issue is that after I upgraded to go1.23, the grpc client no longer detects when the server is restarted. When the server restarts, the custom dialer should be called and trigger "go doReplay()".

What am I missing in go1.23? Has the underlying grpc behavior changed?

dfawley commented 1 month ago

What version of gRPC-Go are you using? Did you change it when upgrading Go? We haven't supported Go 1.12 in quite some time (several years), so the version of gRPC-Go you were using before isn't something we'd be able to provide support for anymore.

Shaffath commented 1 month ago

@dfawley First of all, thank you for your response. I am not sure which version of grpc I was using when compiling with go1.12, but glide was locked to the commit below (https://github.com/grpc/grpc-go/commits/18957c5fcde0c3037144153d3db03756542007e5). When I migrated to go1.23, I updated grpc to v1.65.0 in the go.mod file.

I am not asking for support for an older version of grpc. I have a feature that worked fine with the older version of grpc but broke when I started using v1.65. I just want help understanding what can be done to address the issue.

As shown in the code, when the server restarts, the client is expected to detect it in real time and send "Replay" messages once the connection is re-established. With the latest version this no longer works. I just need input on how to detect server restarts in the latest version of grpc. I want to use a custom dialer function so that some additional functions run whenever the connection is established.

dfawley commented 1 month ago

I'm not fully following what you're doing in your code. Can you maybe try to explain what you are trying to accomplish? gRPC already detects when connections are lost and re-establishes the connection. You should be able to put retrying logic around the RPCs themselves, or in an interceptor, or use gRPC's built-in retry functionality, depending on your needs.

Shaffath commented 1 month ago

Here is what the code is trying to accomplish.

  1. The grpc client connects to the server using a custom dialer function.
  2. When the server is connected, it triggers a function called "Replay()" to replay all the flows the client has to the server.
  3. When the server is restarted/rebooted or the connection between the client and server flaps, the client detects it via the underlying grpc protocol and triggers the custom dialer to reconnect and call "Replay()" once the connection is established.

This logic worked fine with go1.12 and the old version of grpc.

Now that I have upgraded to go1.23 with grpc v1.65.0, here is the problem I am facing.

  1. When the server is restarted/rebooted, the custom dialer is not triggered automatically, which means grpc is not detecting the failure state of the server and never tries to re-establish the connection.
  2. As you can see, the code already includes retry logic that should retry establishing the connection when the server is back up, but with the latest code that does not seem to happen.

So the things you mentioned, grpc retry and a grpc interceptor, are already in place in our code and are not helping. I want logic where the client detects server restarts and triggers a callback.

janardhanvissa commented 1 month ago

Enable keepalive to make sure the client detects server restarts. Monitor connection state using conn.GetState() to handle connection transitions manually. Switch to grpc.WithContextDialer if you're still using grpc.WithDialer, as it's deprecated.

dfawley commented 1 month ago

In addition, I'd recommend checking out:

> grpc is not detecting the failure state of server and never tried to re-establish connection.

If keepalive doesn't fix it, then if you can provide more info here and share a reproducible test that demonstrates what you're seeing, that would be helpful.
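
For reference, client-side keepalive is configured through a dial option. The sketch below uses `grpc.WithKeepaliveParams` with `keepalive.ClientParameters`; the function name `keepaliveOption` and the interval values are illustrative assumptions, not recommendations from this thread.

```go
package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

// keepaliveOption (hypothetical helper) builds a dial option that makes the
// client send HTTP/2 pings so a dead connection is noticed without an RPC.
// The durations are placeholder values chosen for illustration only.
func keepaliveOption() grpc.DialOption {
	return grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                10 * time.Second, // ping after this much inactivity
		Timeout:             3 * time.Second,  // wait this long for the ping ack
		PermitWithoutStream: true,             // ping even when no RPC is active
	})
}
```

The option would be added to the existing dial options. Servers enforce a minimum ping interval through `keepalive.EnforcementPolicy` (five minutes by default in gRPC-Go), so a short `Time` generally needs a matching server-side policy, otherwise the server may close the connection.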

Generally speaking, we're going to need to see some code, since the things you're doing are unusual and it's unclear from your description what is happening, e.g.:

> when the server is connected it triggers a function called "Replay()" to replay all the flows the client has to the server.

  • What does that Replay thing do?

> the client detects it via the underlying grpc protocol

  • How?

But, please share code, since I don't think we'll be able to help without it.

Shaffath commented 1 month ago

Will check and implement based on that.