grpc / grpc-dotnet

gRPC for .NET
Apache License 2.0
4.17k stars 769 forks source link

Return full stacktrace on service startup failures #1839

Closed cove closed 2 years ago

cove commented 2 years ago

Is your feature request related to a problem? Please describe.

I'd like to see if service stacktraces can be bubbled up during service startup.

While writing integration tests for a service the gRPC call would fail with an unknown error and the detailed error messages and adding log interceptors didn't make it any clearer to me where the issue was. The root cause was IConfiguration wasn't being passed into my service on startup during the test so some values were null and causing the service to not start up, but if the stackstrace was easily available, without having to debug the pkg, it would have reduce the amount of code to look though for the cause.

Describe the solution you'd like

Specifically this call was discarding the stacktrace during startup due to the finally, could we catch the exception and bubble it up here?

        private async Task<TResponse> ResolvedInterceptorInvoker(TRequest resolvedRequest, ServerCallContext resolvedContext)
        {
            GrpcActivatorHandle<TService> serviceHandle = default;
            try
            {
                serviceHandle = ServiceActivator.Create(resolvedContext.GetHttpContext().RequestServices);
                return await _invoker(serviceHandle.Instance, resolvedRequest, resolvedContext);
            }
            finally
            {
                if (serviceHandle.Instance != null)
                {
                    await ServiceActivator.ReleaseAsync(serviceHandle);
                }
            }
        }

Describe alternatives you've considered

Log interceptors weren't that helpful since it doesn't tell you which code the error came from.

Additional context

It would also be nice if IConfiguration was injected during the test, but that's likely a different ticket.

JamesNK commented 2 years ago

A finally doesn't catch the exception. It's rethrown automatically after the finally is complete. However, if a new error is thrown in the finally, I think that could replace the original exception.

Do you have an example of this happening? I don't understand what you're doing or how IConfiguration comes into the picture.

cove commented 2 years ago

I have a test that looks like this:

        [Test]
        public async Task LoginTest()
        {
            var client = new SesOnline.Flatbuf.AuthService.AuthServiceClient(Channel);

            var response =  await client.Login(new LoginRequest
            {
                ClientCredentialType = ClientCredentialType.DebugTrustClient,
                ClientCredential = "test cred",
                RequestedCredentialType = ServiceCredentialType.JWT
            });

            response.CredentialType.Should().Be(ServiceCredentialType.JWT);
        }

When I run it, I get the following error:

Grpc.Core.RpcException : Status(StatusCode="Unknown", Detail="Exception was thrown by handler. ArgumentNullException: Value cannot be null. (Parameter 's')")
   at Tests.IT.IntegrationTests.LoginTest() in D:\SES\cove_photon_frontend_test_refactoring_835\Orleans\Tests\IT\IntegrationTests.cs:line 88
   at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
   at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
   at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
   at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
   at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)

I wasn't sure what Parameter 's' was or where it was coming from, but setting a BP before ResolvedInterceptorInvoker() exited I could get the exception. Right so not the finally, but I guess the callee UnaryServerHandler() could have logged it it looks like. But the exception had enough information to narrow down where the error was coming from better:

System.ArgumentNullException: Value cannot be null. (Parameter 's')
   at System.Int32.Parse(String s)
   at Frontend.Services.AuthService..ctor(ILogger`1 logger, ClusterClientService clusterService, StreamAdapterService streamAdapter, IConfiguration config) in D:\SES\cove_photon_frontend_test_refactoring_835\Orleans\Frontend\Services\AuthService.cs:line 51
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Span`1& arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
   at System.Reflection.RuntimeConstructorInfo.Invoke(BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitConstructor(ConstructorCallSite constructorCallSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite callSite, RuntimeResolverContext context)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite callSite, ServiceProviderEngineScope scope)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(Type serviceType)
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey key, Func`2 valueFactory)
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type serviceType, ServiceProviderEngineScope serviceProviderEngineScope)
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type serviceType)
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetService[T](IServiceProvider provider)
   at Grpc.AspNetCore.Server.Internal.DefaultGrpcServiceActivator`1.Create(IServiceProvider serviceProvider)
   at Grpc.Shared.Server.UnaryServerMethodInvoker`3.ResolvedInterceptorInvoker(TRequest resolvedRequest, ServerCallContext resolvedContext)
   at Grpc.Shared.Server.UnaryServerMethodInvoker`3.ResolvedInterceptorInvoker(TRequest resolvedRequest, ServerCallContext resolvedContext)

Looking into the constructor, although the line numbers weren't there, there's only one Int32.Parse(), and it was getting a null from the injected IConfiguration:


        public AuthService(ILogger<AuthService> logger, ClusterClientService clusterService,
            StreamAdapterService streamAdapter, IConfiguration config) : base(_logger, _clusterService, _streamAdapter, _config)
        {
            _logger = logger;
            _clusterService = clusterService;
            _streamAdapter = streamAdapter;
            _config = config;

            // temp work around
            // _config["Jwt:RsaPrivateKeyPem"] = "x";
            // _config["Jwt:Issuer"] = "x";
            // _config["Jwt:Audience"] = "x";
            // _config["Jwt:RolesPlayer"] = "player";
            // _config["Jwt:RolesAdmin"] = "admin player";
            // _config["Jwt:TokenExpireTtl"] = "3600";
            // _config["Jwt:TokenRefreshExpireTtl"] = "604800";

            _tokenExpireTtl = Int32.Parse(_config["Jwt:TokenExpireTtl"]);
            _jwtSecurityTokenHandler = new JwtSecurityTokenHandler();

            RSACryptoServiceProvider rsaCryptoServiceProvider = new RSACryptoServiceProvider();
            rsaCryptoServiceProvider.ImportFromPem(_config["Jwt:RsaPrivateKeyPem"]);

            _rsaSecurityKey = new RsaSecurityKey(rsaCryptoServiceProvider);
        }

Actually it could be IConfiguration was injected, but the values weren't loaded from appsettings.json or appsettings.Development.json, haven't checked yet. The appsettings.json is in the Test project folder.

Thoughts?

JamesNK commented 2 years ago

Ok, thanks. Your error comes from making a gRPC call, which is then sent across HTTP to the client. Stack traces are included when errors are serialized because the stack trace could easily exceed the maximum size of a HTTP header (8 KB?)

To get complete information about an error you should configure server logging.

cove commented 2 years ago

I do have server Grpc debug logging enabled, and I have multiple interceptors to improve the logging server side. But all I was getting was the generic error Error when executing service method 'Login'. that gets bubbled up from HandleCallAsync():

e.g. server side

0.035s Grpc.Net.Client.Internal.GrpcCall - Debug: Starting gRPC call. Method type: 'Unary', URI: 'http://localhost/SesOnline.Flatbuf.AuthService/Login'.
0.050s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' created with addresses: localhost:80
0.054s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
0.055s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '1' connection requested.
0.059s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' state changed to Connecting. Detail: 'Connection requested.'.
0.060s Grpc.Net.Client.Balancer.PickFirstBalancer - Trace: Processing subchannel id '1' state changed to Connecting. Detail: 'Connection requested.'.
0.061s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel state updated to Connecting.
0.065s Grpc.Net.Client.Balancer.Subchannel - Trace: Subchannel id '1' connecting to transport.
0.065s Grpc.Net.Client.Balancer.Subchannel - Debug: Subchannel id '1' state changed to Ready. Detail: 'Passively connected.'.
0.065s Grpc.Net.Client.Balancer.PickFirstBalancer - Trace: Processing subchannel id '1' state changed to Ready. Detail: 'Passively connected.'.
0.065s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel state updated to Ready.
0.065s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Channel picker updated.
0.069s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Trace: Pick started.
0.070s Grpc.Net.Client.Balancer.Internal.ConnectionManager - Debug: Successfully picked subchannel id '1' with address localhost:80.
0.082s Grpc.Net.Client.Internal.GrpcCall - Debug: Sending message.
0.091s Grpc.Net.Client.Internal.GrpcCall - Trace: Serialized 'SesOnline.Flatbuf.LoginRequest' to 42 byte message.
0.091s Grpc.Net.Client.Internal.GrpcCall - Trace: Message sent.
0.106s Microsoft.AspNetCore.Hosting.Diagnostics - Information: Request starting HTTP/2 POST http://localhost/SesOnline.Flatbuf.AuthService/Login application/grpc -
0.107s Microsoft.AspNetCore.HostFiltering.HostFilteringMiddleware - Debug: Wildcard detected, all requests with hosts will be allowed.
0.107s Microsoft.AspNetCore.HostFiltering.HostFilteringMiddleware - Trace: All hosts are allowed.
0.138s Microsoft.AspNetCore.Routing.Matching.DfaMatcher - Debug: 3 candidate(s) found for the request path '/SesOnline.Flatbuf.AuthService/Login'
0.140s Microsoft.AspNetCore.Routing.Matching.DfaMatcher - Debug: Endpoint 'gRPC - /SesOnline.Flatbuf.AuthService/Login' with route pattern '/SesOnline.Flatbuf.AuthService/Login' is valid for the request path '/SesOnline.Flatbuf.AuthService/Login'
0.140s Microsoft.AspNetCore.Routing.Matching.DfaMatcher - Debug: Endpoint 'gRPC - Unimplemented method for SesOnline.Flatbuf.AuthService' with route pattern 'SesOnline.Flatbuf.AuthService/{unimplementedMethod}' is valid for the request path '/SesOnline.Flatbuf.AuthService/Login'
0.140s Microsoft.AspNetCore.Routing.Matching.DfaMatcher - Debug: Endpoint 'gRPC - Unimplemented service' with route pattern '{unimplementedService}/{unimplementedMethod}' is valid for the request path '/SesOnline.Flatbuf.AuthService/Login'
0.140s Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware - Debug: Request matched endpoint 'gRPC - /SesOnline.Flatbuf.AuthService/Login'
0.160s Microsoft.AspNetCore.Authentication.JwtBearer.JwtBearerHandler - Debug: AuthenticationScheme: Bearer was not authenticated.
0.162s Microsoft.AspNetCore.Routing.EndpointMiddleware - Information: Executing endpoint 'gRPC - /SesOnline.Flatbuf.AuthService/Login'
0.171s Grpc.AspNetCore.Server.ServerCallHandler - Debug: Reading message.
0.174s Grpc.AspNetCore.Server.ServerCallHandler - Trace: Deserializing 42 byte message to 'SesOnline.Flatbuf.LoginRequest'.
0.179s Grpc.AspNetCore.Server.ServerCallHandler - Trace: Received message.
0.186s Tests.IT.Helpers.ServerLoggerInterceptor - Error: Error thrown by /SesOnline.Flatbuf.AuthService/Login.
0.189s Grpc.AspNetCore.Server.ServerCallHandler - Error: Error when executing service method 'Login'.
0.191s Microsoft.AspNetCore.Routing.EndpointMiddleware - Information: Executed endpoint 'gRPC - /SesOnline.Flatbuf.AuthService/Login'
0.196s Microsoft.AspNetCore.Hosting.Diagnostics - Information: Request finished HTTP/2 POST http://localhost/SesOnline.Flatbuf.AuthService/Login application/grpc - - 200 - application/grpc 90.6389ms
0.197s Grpc.Net.Client.Internal.GrpcCall - Trace: Response headers received.
0.201s Grpc.Net.Client.Internal.GrpcCall - Information: Call failed with gRPC error status. Status code: 'Unknown', Message: 'Exception was thrown by handler. ArgumentNullException: Value cannot be null. (Parameter 's')'.
0.201s Grpc.Net.Client.Internal.GrpcCall - Debug: Finished gRPC call.
JamesNK commented 2 years ago

From your logs:

0.189s Grpc.AspNetCore.Server.ServerCallHandler - Error: Error when executing service method 'Login'.

This should be writing out the exception message. The exception message is passed to it:

https://github.com/grpc/grpc-dotnet/blob/bfe312a3baeea3dc3676b3770dedd1b927e42fb3/src/Grpc.AspNetCore.Server/Internal/HttpContextServerCallContext.cs#L199

I'm guessing the way you've configured console logging is not including the exception message.

cove commented 2 years ago

Oh I see, I just copied the sample code verbatim and didn't look too closely at it, so that's where it's getting dropped then:

       private void WriteMessage(LogLevel logLevel, string category, EventId eventId, string message, Exception? exception)
        {
            // Log using the passed in execution context.
            // In the case of NUnit, console output is only captured by the test
            // if it is written in the test's execution context.
            ExecutionContext.Run(_executionContext, s =>
            {
                Console.WriteLine($"{_stopwatch.Elapsed.TotalSeconds:N3}s {category} - {logLevel}: {message}");
            }, null);
        }

Also any thoughts on why the IConfiguration isn't loading values from appsettings.json in the test?

JamesNK commented 2 years ago

I improved the example here - https://github.com/grpc/grpc-dotnet/pull/1841