grpc / grpc-dotnet

gRPC for .NET

"loadbalancer.temporary.invalid" causes connection error #2241

Closed someview closed 1 year ago

someview commented 1 year ago

version:

grpc-dotnet: 2.5.5

steps

  1. Create a client factory.
  2. Use the client factory to create gRPC client A, then use client A to call a service that does not exist, such as friendservice.
  3. Other clients created from the same factory can no longer send gRPC requests.

    Description:

    dbug: 2023-08-11 11:56:33Z Grpc.Net.Client.Internal.GrpcCall[1] Starting gRPC call. Method type: 'Unary', URI: 'http://loadbalancer.temporary.invalid/JD.Friend.Contract.FriendApplyWave
    dbug: 2023-08-11 11:56:33Z Grpc.Net.Client.Balancer.Internal.ConnectionManager[9] Picked failure with status: Status(StatusCode="Unavailable", Detail="Resolver returned no addresses.")
    fail: 2023-08-11 11:56:33Z Grpc.Net.Client.Internal.GrpcCall[6] Error starting gRPC call.
    Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Resolver returned no addresses.")
       at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken)
       at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
       at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)
    info: 2023-08-11 11:56:33Z Grpc.Net.Client.Internal.GrpcCall[3] Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Resolver returned no addresses.'.
    dbug: 2023-08-11 11:56:33Z Grpc.Net.Client.Internal.GrpcCall[4] Finished gRPC call.
    dbug: 2023-08-11 11:56:33Z Grpc.Net.Client.Internal.GrpcCall[8] gRPC call canceled.
    fail: 2023-08-11 11:56:33Z TL.BaseApp.Handlers.GlobalExceptionHandler[9000] Exception:RpcException Message:Status(StatusCode="Unavailable", Detail="Resolver returned no addresses.") Po
    Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Resolver returned no addresses.")
       at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken)
       at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
       at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout)
       at TL.Grpc.GrpcClientInterceptor.HandleResponse[TResponse](Task`1 t, ISpan span)
       at ProtoBuf.Grpc.Internal.Reshape.UnaryTaskAsyncImpl[TRequest,TResponse](AsyncUnaryCall`1 call, MetadataContext metadata, CancellationToken cancellationToken) in /_/src/protobuf-net
       at TL.SysRun.RHelper.ThrowWhenErrorAsync[TCode](Task`1 rTask)
       at JD.WaveService.ControllerBusiness.Implement.WaveFriendApplyControllerBusiness.CbAddFriendApply(GwReqAddFriendApply body, Int64 userId) in /home/jenkins/agent/workspace/后端更新依
       at lambda_method219(Closure, Object)
       at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.AwaitableObjectResultExecutor.Execute(ActionContext actionContext, IActionResultTypeMapper mapper, ObjectMethodExecut
       at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask`1 actionResultValueTask)
       at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope sco
       at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
       at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
       at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeInnerFilterAsync>g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, O

    This occurs when the first service that is called does not exist.

someview commented 1 year ago

I traced the code and found this snippet in the GrpcChannel class:

#if NET5_0_OR_GREATER
    private static readonly Uri HttpLoadBalancerTemporaryUri = new Uri("http://loadbalancer.temporary.invalid");
    private static readonly Uri HttpsLoadBalancerTemporaryUri = new Uri("https://loadbalancer.temporary.invalid");

    private static bool IsProxied(SocketsHttpHandler socketsHttpHandler, Uri address, bool isSecure)
    {
        // Check standard address directly.
        // When load balancing the channel doesn't know the final addresses yet so use temporary address.
        Uri resolvedAddress;
        if (IsHttpOrHttpsAddress(address))
        {
            resolvedAddress = address;
        }
        else if (isSecure)
        {
            resolvedAddress = HttpsLoadBalancerTemporaryUri;
        }
        else
        {
            resolvedAddress = HttpLoadBalancerTemporaryUri;
        }

        var proxy = socketsHttpHandler.Proxy ?? HttpClient.DefaultProxy;
        return proxy.GetProxy(resolvedAddress) != null;
    }
#endif

When the GrpcChannel is created like this:

channel = GrpcChannel.ForAddress(address, new GrpcChannelOptions
{
    HttpHandler = new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
        PooledConnectionIdleTimeout = TimeSpan.FromSeconds(60),
        ConnectTimeout = TimeSpan.FromSeconds(5),
    },
    Credentials = credentials,
    ServiceConfig = new ServiceConfig
    {
        LoadBalancingConfigs =
        {
            new BalanceConfig(opt.BalancerName)
        }
    },
    LoggerFactory = _loggerFactory,
    ServiceProvider = services.BuildServiceProvider(),
});

The line "var proxy = socketsHttpHandler.Proxy ?? HttpClient.DefaultProxy" may produce an unexpected result if HttpClient.DefaultProxy is not set correctly. This may cause the situation I described above: the channel treats an address like "grpc:///roomservice:5002" as a proxy URL.

dbug: 2023-08-14 01:55:39Z Grpc.Net.Client.Internal.GrpcCall[1] Starting gRPC call. Method type: 'Unary', URI: 'http://loadbalancer.temporary.invalid/TL.Count.Contract.Cou
dbug: 2023-08-14 01:55:39Z Grpc.Net.Client.Balancer.Internal.ConnectionManager[6] Successfully picked subchannel id '1' with address 100.105.97.197:5002.

There may be a concurrency issue: when the first client has no available address, it may affect the other clients.
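
One way to test the proxy hypothesis is to run the same check as IsProxied in isolation. This is a minimal sketch, not code from grpc-dotnet: the ProxyCheck class and the proxy address http://proxy.example:8080 are made up for illustration, and only System.Net types are used.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class ProxyCheck
{
    // Same test that IsProxied applies: a non-null result from GetProxy
    // is treated as "this address goes through a proxy".
    public static bool IsProxied(IWebProxy proxy, Uri address)
    {
        return proxy.GetProxy(address) != null;
    }

    public static void Main()
    {
        // Placeholder host the channel uses for non-http(s) addresses.
        var placeholder = new Uri("http://loadbalancer.temporary.invalid");

        // Hypothetical explicit proxy, for illustration only.
        var explicitProxy = new WebProxy("http://proxy.example:8080");
        Console.WriteLine($"Explicit proxy: {IsProxied(explicitProxy, placeholder)}");

        // The process-wide default; the result depends on environment
        // variables such as HTTP_PROXY and on OS proxy settings.
        Console.WriteLine($"Default proxy:  {IsProxied(HttpClient.DefaultProxy, placeholder)}");
    }
}
```

If the second line prints True on the affected machine, a default proxy is configured and the check would report the placeholder address as proxied.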

someview commented 1 year ago

I have discovered the truth.

    private GrpcMethodInfo CreateMethodInfo(IMethod method)
    {
        var uri = new Uri(method.FullName, UriKind.Relative);
        var scope = new GrpcCallScope(method.Type, uri);
        var methodConfig = ResolveMethodConfig(method);

        var uriBuilder = new UriBuilder(Address);
        uriBuilder.Path = method.FullName;

        // The Uri used to create HttpRequestMessage must have a http or https scheme.
        uriBuilder.Scheme = IsSecure ? Uri.UriSchemeHttps : Uri.UriSchemeHttp;

        // A Uri with a http or https scheme requires a host name.
        // Triple slash URIs, e.g. dns:///custom-value, won't have a host and UriBuilder throws an error.
        // Add a temp value as the host. The tempuri.org host may show up in some logging but it will
        // get replaced in the final HTTP request address by the load balancer.
        if (string.IsNullOrEmpty(uriBuilder.Host))
        {
            // .invalid is reserved for temporary host names.
            // https://datatracker.ietf.org/doc/html/rfc2606#section-2
            uriBuilder.Host = "loadbalancer.temporary.invalid";
        }

        return new GrpcMethodInfo(scope, uriBuilder.Uri, methodConfig);
    }

The gRPC call itself succeeds, but the log entry is misleading: it shows http://loadbalancer.temporary.invalid/TL.Count.Contract.CountContract/getCounts even when a normal address has been resolved.
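
The placeholder in the log can be reproduced outside the library. The following is a simplified sketch of the host fallback in CreateMethodInfo, not the library code: PlaceholderUri, BuildRequestUri and its parameters are hypothetical names, and the host is passed in as a string instead of being read from the channel address.

```csharp
using System;

public static class PlaceholderUri
{
    // Builds the request URI the way the snippet above does: force an
    // http/https scheme and fall back to a reserved .invalid host when
    // the channel address has no host (for example dns:///custom-value).
    public static Uri BuildRequestUri(string host, bool isSecure, string methodFullName)
    {
        var uriBuilder = new UriBuilder
        {
            Scheme = isSecure ? Uri.UriSchemeHttps : Uri.UriSchemeHttp,
            Host = string.IsNullOrEmpty(host) ? "loadbalancer.temporary.invalid" : host,
            Path = methodFullName,
        };
        return uriBuilder.Uri;
    }

    public static void Main()
    {
        // Empty host, as with a triple-slash address.
        Console.WriteLine(BuildRequestUri("", false, "/TL.Count.Contract.CountContract/getCounts"));
        // prints http://loadbalancer.temporary.invalid/TL.Count.Contract.CountContract/getCounts
    }
}
```

The URI logged by GrpcCall is built before the load balancer picks a subchannel, which would explain why the real address only shows up in the ConnectionManager log line.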