Particular / NServiceBus

Build, version, and monitor better microservices with the most powerful service platform for .NET
https://particular.net/nservicebus/
Other
2.07k stars 651 forks source link

Stacktrace on failed messages is incomplete #2248

Closed warrenrumak closed 9 years ago

warrenrumak commented 9 years ago

After upgrading from 4.4.2, all we get for a stack trace when an exception happens in a message handler is:

It's pretty clear that this is a consequence of the badly-judged implementation of avoiding deeply-nested stack traces (d4795bd99164d29a7559b2826e4eed5451418678). I also see that further changes were made in this area a few weeks later to avoid the bad practice of doing catch (Exception e) { throw e; } but that code has not made it into a 4.6 branch release. It would be nice if you did -- my only alternative right now seems to be to wrap all handlers with a try/catch block and do my own error logging.

SimonCropp commented 9 years ago

@warrenrumak having trouble reproducing this one

I wrote a handler that does this

class MyHandler : IHandleMessages<MyMessage>
{
    public void Handle(MyMessage message)
    {
        Foo();
    }

    static void Foo()
    {
        Bar();
    }

    static void Bar()
    {
        throw new Exception("sdfsdf");
    }
}

And the stack trace I got was this

System.Exception: sdfsdf
   at MyHandler.Bar() in c:\Code\ConsoleApplication2\ConsoleApplication2\ProgramService.cs:line 80
   at MyHandler.Foo() in c:\Code\ConsoleApplication2\ConsoleApplication2\ProgramService.cs:line 75
   at MyHandler.Handle(MyMessage message) in c:\Code\ConsoleApplication2\ConsoleApplication2\ProgramService.cs:line 70
   at NServiceBus.Unicast.HandlerInvocationCache.Invoke(Object handler, Object message, Dictionary`2 dictionary) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\HandlerInvocationCache.cs:line 63
   at NServiceBus.Unicast.Behaviors.InvokeHandlersBehavior.Invoke(HandlerInvocationContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Behaviors\InvokeHandlersBehavior.cs:line 29
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 51   at NServiceBus.Pipeline.BehaviorChain`1.Invoke(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 34
   at NServiceBus.Pipeline.PipelineExecutor.Execute[T](BehaviorChain`1 pipelineAction, T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 139
   at NServiceBus.Unicast.Behaviors.LoadHandlersBehavior.Invoke(ReceiveLogicalMessageContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Behaviors\LoadHandlersBehavior.cs:line 37
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 51   at NServiceBus.Pipeline.BehaviorChain`1.Invoke(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 34
   at NServiceBus.Pipeline.PipelineExecutor.Execute[T](BehaviorChain`1 pipelineAction, T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 139
   at NServiceBus.Unicast.Messages.ExecuteLogicalMessagesBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Messages\ExecuteLogicalMessagesBehavior.cs:line 28
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 51   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 62
   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 24

Which seems correct.

Can you provide a way to reproduce your issue?

Where are you looking at the stack trace? In the debugger, in the log file or elsewhere?

Just to confirm you are using 4.6.3?

Wwhat change are you referring to here

I also see that further changes were made in this area a few weeks later to avoid the bad practice of doing catch (Exception e) { throw e; } but that code has not made it into a 4.6 branch release

mlidbom commented 9 years ago

We're seeing a similar problem. I'm running 4.6.3

Obviously your stack trace mutilation strategy is failing to handle one or more edge cases.

This is the full call stack we get:

Composable.SystemExtensions.Threading.MultiThreadedUseException: Attempt to use 7d052729-119f-4562-8588-58c1962feab0: Manpower.Applications.ReferenceDataModule.ViewModels.Services.ViewModelsSession from thread Id:59, Name:  when owning thread was Id: 20 Name: 
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 62
   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 24

Every single bit of information about the involvement of our code is stripped from that exception. It is extremely hard to try to debug this without a sane stack trace.

We've been running our own fork of NServiceBus 3 for years because of several show stopper bugs in NServiceBus 3. Mutilated stack traces is one of those and is still here after years and several bugs reported and "fixed". Will I have to create and maintain a fork of NServiceBus 4 as well?

All parts of a stack trace is essential information. I need to know how NServiceBus interacts with my message in order to fully understand what is happening when an exception is thrown. Removing essential information does not make things easier for anyone. Modifying stack traces in any way is lying to the poor developer that needs to fix the bug. I can't for the life of me understand how you can choose to intentionally mutilate stacks traces.

Please completely stop trying to "help" by lying through mutilating exceptions. You have been trying to get this working acceptably for years and it still does not. Can you really find a reasonable motivation for deviating from the standard recommended way of handling exceptions?

One simple rule is all you need to fix this: never ever re-throw an exception without wrapping it in another exception or using this exact line of code:

throw;
mlidbom commented 9 years ago

I just remembered a bit of code I saw years ago in NServiceBus so I need to add another rule:

Never ever do this:

throw new WrappingException(caughtException.InnerException);

This was done with the assumption that caughtException was of a certain type thrown from a specific line of code. There were a number of edge cases where it was not and this did not have the expected effect.

Either way this is a terrible idea in my opinion. I never want to be lied to about the call stack at the time an exception is thrown. When I fix a bug I need to know what happened, not some "improved" version of reality.

mlidbom commented 9 years ago

OK searched the source using this regular expression: throw [^n]

Found these bits of code in BehaviorChain.cs:

            catch (Exception exception)
            {
                // ReSharper disable once PossibleIntendedRethrow
                // need to rethrow explicit exception to preserve the stack trace
                throw exception;
            }
            catch (Exception exception)
            {
                if (!stackTracePreserved)
                {
                    exception.PreserveStackTrace();
                }
                stackTracePreserved = true;
                // ReSharper disable once PossibleIntendedRethrow
                // need to rethrow explicit exception to preserve the stack trace
                throw exception;
            }

Along with the mysterious class StackTracePreserver with this code:

 public static void PreserveStackTrace(this Exception exception)
        {
            if (!exception.GetType().IsSerializable)
            {
                return;
            }
            try
            {
                var context = new StreamingContext(StreamingContextStates.CrossAppDomain);
                var objectManager = new ObjectManager(null, context);
                var serializationInfo = new SerializationInfo(exception.GetType(), new FormatterConverter());

                exception.GetObjectData(serializationInfo, context);
                objectManager.RegisterObject(exception, 1, serializationInfo); // prepare for SetObjectData
                objectManager.DoFixups(); // ObjectManager calls SetObjectData
            }
            // ReSharper disable once EmptyGeneralCatchClause
            catch (Exception)
            {
                //this is a best effort. if we fail to patch the stack trace just let it go
            }
        }

Those throw exception; calls are pretty much guaranteed to be the cause of my problem.

What in the world are you trying to achieve here that would not just work out of the box if you just wrapped the exception or used throw; ?

I'm not clean on how PreserveStackTrace is supposed to work, but it looks to me like it will completely fail to do its job if the exception is not serializable. But the code that uses it does not in any way take this into account. Does that mean that this code is guaranteed to totally corrupt the call stack for all thrown exceptions that are not serializable? In other words the great majority of custom exceptions?

That would explain why your test using the built in and serializable Exception class works.

SimonCropp commented 9 years ago

I can confirm this

class MyHandler:IHandleMessages<MyMessage>
{
    public void Handle(MyMessage message)
    {
        ClassThaThrows.MethodThatThrows();
    }
}

class ClassThaThrows
{
    public static void MethodThatThrows()
    {
        MethodThatThrows2();
    }

    static void MethodThatThrows2()
    {
        throw new MyException("Foo Bar");
    }
}

class MyException : Exception
{
    public MyException(string message)
        : base(message)
    {

    }
}

class MyMessage:IMessage
{
}

produces an unhelpful stack trace

StackTraceRepro.MyException: Foo Bar
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 62
   at NServiceBus.Pipeline.BehaviorChain`1.<>c__DisplayClass2.<InvokeNext>b__0() in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 51
   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 24

Will look into further

SimonCropp commented 9 years ago

repro here https://github.com/Particular/BugsRepro/tree/master/StackTraceRepro

SimonCropp commented 9 years ago

issue does not exist in 5

StackTraceRepro.MyException: Foo Bar
   at StackTraceRepro.ClassThaThrows.MethodThatThrows2() in c:\Code\Particular\BugsRepro\StackTraceRepro\StackTraceRepro\MyHandler.cs:line 23
   at StackTraceRepro.ClassThaThrows.MethodThatThrows() in c:\Code\Particular\BugsRepro\StackTraceRepro\StackTraceRepro\MyHandler.cs:line 18
   at StackTraceRepro.MyHandler.Handle(MyMessage message) in c:\Code\Particular\BugsRepro\StackTraceRepro\StackTraceRepro\MyHandler.cs:line 10
   at lambda_method(Closure , Object , Object )
   at NServiceBus.Unicast.MessageHandlerRegistry.Invoke(Object handler, Object message, Dictionary`2 dictionary) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\MessageHandlerRegistry.cs:line 126
   at NServiceBus.Unicast.MessageHandlerRegistry.InvokeHandle(Object handler, Object message) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\MessageHandlerRegistry.cs:line 84
   at NServiceBus.Unicast.Behaviors.LoadHandlersBehavior.<Invoke>b__1(Object handlerInstance, Object message) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Behaviors\LoadHandlersBehavior.cs:line 40
   at NServiceBus.Unicast.Behaviors.InvokeHandlersBehavior.Invoke(IncomingContext context, Action next) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Behaviors\InvokeHandlersBehavior.cs:line 22
   at NServiceBus.Pipeline.BehaviorChain`1.InvokeNext(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 55
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NServiceBus.Pipeline.BehaviorChain`1.Invoke(T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\BehaviorChain.cs:line 36
   at NServiceBus.Pipeline.PipelineExecutor.Execute[T](BehaviorChain`1 pipelineAction, T context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 128
   at NServiceBus.Pipeline.PipelineExecutor.InvokePipeline[TContext](IEnumerable`1 behaviors, TContext context) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 74
   at NServiceBus.Pipeline.PipelineExecutor.InvokeReceivePhysicalMessagePipeline() in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 100
   at NServiceBus.Unicast.UnicastBus.TransportMessageReceived(Object sender, TransportMessageReceivedEventArgs e) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\UnicastBus.cs:line 815
   at NServiceBus.Unicast.Transport.TransportReceiver.OnTransportMessageReceived(TransportMessage msg) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 423
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at NServiceBus.Unicast.Transport.TransportReceiver.ProcessMessage(TransportMessage message) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 355
   at NServiceBus.Unicast.Transport.TransportReceiver.TryProcess(TransportMessage message) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 228
   at NServiceBus.Transports.Msmq.MsmqDequeueStrategy.ProcessMessage(TransportMessage message) in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Transports\Msmq\MsmqDequeueStrategy.cs:line 281
   at NServiceBus.Transports.Msmq.MsmqDequeueStrategy.Action() in c:\BuildAgent\work\31f8c64a6e8a2d7c\src\NServiceBus.Core\Transports\Msmq\MsmqDequeueStrategy.cs:line 232
SimonCropp commented 9 years ago

@mlidbom so just to confirm can you make your exception serializable to check if it fixes the stack trace

SimonCropp commented 9 years ago

@mlidbom

We've been running our own fork of NServiceBus 3 for years because of several show stopper bugs in NServiceBus 3

What are the "show stopper" bugs in 3? 3.3 is still being actively maintained (there has been 3 patches in the past two months) so if you have bugs you feel are critical please highlight them

mlidbom commented 9 years ago

@SimonCropp I don't have the time to make that change and provoke the exception right now.

Same goes for digging into the history of our patches. To busy right now. So you're getting my vague memories that may not correspond to reality for now:

We extracted our fork a long time ago and I, apparently wrongly, assumed that those issues would have been fixed long ago when we recently upgraded to the canonical 4.6.3

Unless a version with this fixed is released very shortly indeed we will be forced down the fork road again. There is no way we can run something in production that does not provide a sane stack trace when something goes wrong.

SimonCropp commented 9 years ago

@mlidbom

There is no way we can run something in production that does not provide a sane stack trace when something goes wrong.

The reason for the stack trace manipulation is that due to the nested recursive nature of the NSB pipeline which can result in some very long stack traces. The majority of this stack trace is of no use to the user of NSB. With no stack trace manipulation this can result in very hard to diagnose issue due to a very poor signal to noise ratio. So unfortunately it is a necessary evil. We will look to fixing the edge cases you discovered in a patch to version 4

We extracted our fork a long time ago

Did you raise issues against the NSB repo for the things you had to fix? Or did you send pull requests?

we will be forced down the fork road again

This is obviously not ideal. It would be appreciated if you could highlight your concerns, as you have done in this issue, to they can be addressed.

mlidbom commented 9 years ago

Granted our handling of the problems last time around were not optimal. We did not create pull requests etc as we should have.

The reason for the stack trace manipulation is that due to the nested recursive nature of the NSB pipeline which can result in some very long stack traces. The majority of this stack trace is of no use to the user of NSB. With no stack trace manipulation this can result in very hard to diagnose issue due to a very poor signal to noise ratio. So unfortunately it is a necessary evil. We will look to fixing the edge cases you discovered in a patch to version 4

This is part of why we did not bother last time. Saying that I disagree with this sentiment is an understatement of the highest degree.

In other words, as long as NServiceBus does not let us get the stack traces without mutilation we will run a fork. There is no way I'm running code in production that is intentionally mutilating stack traces.

mlidbom commented 9 years ago

We'll I've started down the fork path since I don't see us seeing eye to eye on stack traces any time soon and I don't have much time at all here. First thing I run into is that ripple cannot restore Particular.Licensing.Sources

As is the first thing I will have to to to make it build at the moment is rip all licensing out on the fork. I really don't want to do this so if you could instruct me on how to get things to build without doing that it would be appreciated.

Here's the problem I'm seeing unless I remove the dependency on Particual.Licensing.Sources:

C:\NServiceBus>buildsupport\ripple.exe restore
Restoring dependencies for solution NServiceBus to C:\NServiceBus\src\packages
GitFlow conventional stability is enabled, stability will be set according to the current branch

==================================================================================
Missing Nugets: Could not find nugets
==================================================================================
     ** Nugets
        * Particular.Licensing.Sources     Float
     ** Problems
        * WebException     The remote server returned an error: (400) Bad Request.

ripple: Could not restore dependencies
ripple: Error executing DownloadMissingNugets

If I remove the dependency on Particular.Licensing.Sources restore works, but then I need to rip out all the licensing code to get things to build.

andreasohlund commented 9 years ago

The ripple issue has now been fixed. Can you pull the latest master and try again?

Cheers,

Andreas

On Fri, Sep 5, 2014 at 4:50 PM, Magnus Lidbom notifications@github.com wrote:

We'll I've started down the fork path since I don't see us seeing eye to eye on stack traces any time soon and I don't have much time at all here. First thing I run into is that ripple cannot restore Particular.Licensing.Sources

As is the first thing I will have to to to make it build at the moment is rip all licensing out on the fork. I really don't want to do this so if you could instruct me on how to get things to build without doing that it would be appreciated.

Here's the problem I'm seeing unless I remove the dependency on Particual.Licensing.Sources:

C:\NServiceBus>buildsupport\ripple.exe restore Restoring dependencies for solution NServiceBus to C:\NServiceBus\src\packages GitFlow conventional stability is enabled, stability will be set according to the current branch

Missing Nugets: Could not find nugets

 ** Nugets
    * Particular.Licensing.Sources     Float
 ** Problems
    * WebException     The remote server returned an error: (400) Bad Request.

ripple: Could not restore dependencies ripple: Error executing DownloadMissingNugets

If I remove the dependency on Particular.Licensing.Sources restore works, but then I need to rip out all the licensing code to get things to build.

— Reply to this email directly or view it on GitHub https://github.com/Particular/NServiceBus/issues/2248#issuecomment-54634752 .

mlidbom commented 9 years ago

Building works now. Thanks :)

mlidbom commented 9 years ago

Status update & more griping/criticism: I started by ripping out/corrected all the code I could find that mutilates stack traces.

Ironically, the test When_exception_is_thrown_in_a_release_build_stack_trace_is_trimmed failed because it thought the stack trace was now too short ;). Turns out that without all the code you added to mutilate the stack traces BehaviorChain.InvokeNext is inlined in a release build making the call stack shorter than the one you create with all this complexity.

After removing the mutilation code the stack trace was still not complete, nor very readable in its formatting. Turns out that the FaultManager was not recursing into InnerException to create its message, but rather just called ToString.

After I created a method to pull out a readable stack trace the below is what I get for the original Fault message. I Completely fail to understand what is supposed to be wrong with this stack trace. It gives an excellent idea of what is actually going on and fixing the bug will be trivial now that we can see what is happening:

   Exception:Composable.SystemExtensions.Threading.MultiThreadedUseException
   Message:Attempt to use 6964410c-6c56-4af1-ab35-71d5565bd855: Manpower.Applications.ReferenceDataModule.ViewModels.Services.ViewModelsSession from thread Id:27, Name:  when owning thread was Id: 23 Name: 

   at Composable.SystemExtensions.Threading.SingleThreadUseGuard.InternalAssertNoChangeOccurred(Object guarded)
   at Composable.SystemExtensions.Threading.UsageGuard.AssertNoContextChangeOccurred(Object guarded)
   at Composable.KeyValueStorage.DocumentDbSession.Get[TValue](Object key)
   at Composable.KeyValueStorage.DocumentDbSessionProxy.Get[TValue](Object key)
   at Manpower.Applications.ReferenceDataModule.ViewModels.Services.CachingReferenceDataViewModelsSession.Get[TValue](Object key)
   at Manpower.Applications.JobSearch.SharedUtils.Translation.WebsiteTranslationContextExtractor.&lt;GetTranslationContext&gt;b__0(UnNamedEntityReference`1 webSiteReference) in c:\BuildArtifacts\dev-temp\JobSearch\JobSearch.SharedUtils\Translation\WebsiteTranslationContextExtractor.cs:line 25
   at System.Linq.Enumerable.WhereSelectListIterator`2.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Manpower.Applications.JobSearch.SharedUtils.Translation.WebsiteTranslationContextExtractor.GetTranslationContext(IEnumerable`1 websiteReferences) in c:\BuildArtifacts\dev-temp\JobSearch\JobSearch.SharedUtils\Translation\WebsiteTranslationContextExtractor.cs:line 24
   at Manpower.Applications.JobSearch.SharedUtils.Translation.WebsiteSpecificEntityTranslator.TranslateForWebsites(Object entity, IEnumerable`1 websiteIds) in c:\BuildArtifacts\dev-temp\JobSearch\JobSearch.SharedUtils\Translation\WebsiteSpecificEntityTranslator.cs:line 27
   at Manpower.Applications.JobSearch.JobManagement.SearchModelUpdater.JobAdvertisementSearchModelUpdater.Handle(IJobEvent message) in c:\BuildArtifacts\dev-temp\JobSearch\JobSearch.JobManagement.SearchModelUpdater\JobManagementSearchModelUpdater.cs:line 50
   at NServiceBus.Unicast.HandlerInvocationCache.Invoke(Object handler, Object message, Dictionary`2 dictionary) in c:\NServiceBus\src\NServiceBus.Core\Unicast\HandlerInvocationCache.cs:line 63
   at NServiceBus.Unicast.Behaviors.InvokeHandlersBehavior.Invoke(HandlerInvocationContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\InvokeHandlersBehavior.cs:line 29
   at NServiceBus.Sagas.SagaPersistenceBehavior.Invoke(HandlerInvocationContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Sagas\SagaPersistenceBehavior.cs:line 32
   at NServiceBus.Sagas.AuditInvokedSagaBehavior.Invoke(HandlerInvocationContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Sagas\AuditInvokedSagaBehavior.cs:line 18
   at NServiceBus.Unicast.Behaviors.SetCurrentMessageBeingHandledBehavior.Invoke(HandlerInvocationContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\SetCurrentMessageBeingHandledBehavior.cs:line 21
   at NServiceBus.Pipeline.PipelineExecutor.Execute[T](BehaviorChain`1 pipelineAction, T context) in c:\NServiceBus\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 137
   at NServiceBus.Unicast.Behaviors.LoadHandlersBehavior.Invoke(ReceiveLogicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\LoadHandlersBehavior.cs:line 37
   at NServiceBus.DataBus.DataBusReceiveBehavior.Invoke(ReceiveLogicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\DataBus\DataBusReceiveBehavior.cs:line 68
   at NServiceBus.Pipeline.MessageMutator.ApplyIncomingMessageMutatorsBehavior.Invoke(ReceiveLogicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\MessageMutator\ApplyIncomingMessageMutatorsBehavior.cs:line 25
   at NServiceBus.Pipeline.PipelineExecutor.Execute[T](BehaviorChain`1 pipelineAction, T context) in c:\NServiceBus\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 137
   at NServiceBus.Unicast.Messages.ExecuteLogicalMessagesBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Messages\ExecuteLogicalMessagesBehavior.cs:line 28
   at NServiceBus.Unicast.Behaviors.CallbackInvocationBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\CallbackInvocationBehavior.cs:line 26
   at NServiceBus.Unicast.Messages.ExtractLogicalMessagesBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Messages\ExtractLogicalMessagesBehavior.cs:line 58
   at NServiceBus.Sagas.RemoveIncomingHeadersBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Sagas\RemoveIncomingHeadersBehavior.cs:line 23
   at NServiceBus.MessageMutator.ApplyIncomingTransportMessageMutatorsBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\MessageMutator\ApplyIncomingTransportMessageMutatorsBehavior.cs:line 22
   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 24
   ---End of inner stack trace---

Exception:System.AggregateException
Message:One or more errors occurred.

   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.AppendEndExceptionsAndRethrow(Exception parentException) in c:\NServiceBus\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 55
   at NServiceBus.UnitOfWork.UnitOfWorkBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\UnitOfWork\UnitOfWorkBehavior.cs:line 33
   at NServiceBus.Unicast.Behaviors.ForwardBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\ForwardBehavior.cs:line 24
   at NServiceBus.Audit.AuditBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Audit\AuditBehavior.cs:line 17
   at NServiceBus.Unicast.Behaviors.ImpersonateSenderBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\ImpersonateSenderBehavior.cs:line 22
   at NServiceBus.Unicast.Behaviors.MessageHandlingLoggingBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\MessageHandlingLoggingBehavior.cs:line 23
   at NServiceBus.Unicast.Behaviors.ChildContainerBehavior.Invoke(ReceivePhysicalMessageContext context, Action next) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Behaviors\ChildContainerBehavior.cs:line 17
   at NServiceBus.Pipeline.PipelineExecutor.InvokeReceivePhysicalMessagePipeline() in c:\NServiceBus\src\NServiceBus.Core\Pipeline\PipelineExecutor.cs:line 41
   at NServiceBus.Unicast.Transport.TransportReceiver.OnTransportMessageReceived(TransportMessage msg) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 449
   ---End of inner stack trace---

Exception:NServiceBus.Unicast.Transport.OnTransportMessageReceivedException
Message:Exception thrown while invoking event handlers for OnTransportMessageReceived

   at NServiceBus.Unicast.Transport.TransportReceiver.ProcessMessage(TransportMessage message) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 378
   at NServiceBus.Unicast.Transport.TransportReceiver.TryProcess(TransportMessage message) in c:\NServiceBus\src\NServiceBus.Core\Unicast\Transport\TransportReceiver.cs:line 260
   at NServiceBus.Transports.Msmq.MsmqDequeueStrategy.Action() in c:\NServiceBus\src\NServiceBus.Core\Transports\Msmq\MsmqDequeueStrategy.cs:line 238

After digging into the code I'm more convinced than ever that whatever gain you might get from this cannot possibly outweigh the risk that this code fails in edge cases. All it does is remove some lines of text. If it fails you are almost completely unable to fix production bugs. There is just no way the gain can make the risk acceptable

mlidbom commented 9 years ago

If you want to have a look at what I've done so far the code is here: https://github.com/ManpowerNordic/NServiceBus/tree/complete-stack-traces

It's a work in progress. Chief among the problems is that I've been unable to successfully run many of the tests, with the original code as well as with my changes. It's probably due to environment setup of some sort that is missing. I'd appreciate it if you could point me in the direction of some information about how to get the test suite working correctly.

warrenrumak commented 9 years ago

@mlidbom thanks for stepping up to dig into this, I've been pulled into some other projects so I haven't had the time to come back to this until now.

mlidbom commented 9 years ago

@warrenrumak You're welcome :)

I'm not being altruistic here though. We have an important release coming up and 90% of all fault messages had no stack trace after upgrading to 4.6.3. Had I not made this fork we would have been in deep trouble. I just really hope that the NServiceBus people opt for stability and safety here so we wont have to maintain a fork.

SimonCropp commented 9 years ago

@mlidbom @warrenrumak what do u guys think of this as an approach? https://github.com/Particular/NServiceBus/pull/2362

mlidbom commented 9 years ago

@SimonCropp Looks mostly good :) Needs some work though. I left more detailed comments on the pull request.

SimonCropp commented 9 years ago

fixed by https://github.com/Particular/NServiceBus/commit/d414f4966152b49988a1a33a9406a3541c1fbad0

SimonCropp commented 9 years ago

@warrenrumak @mlidbom if you guys have time can u have a look at the current state of https://github.com/Particular/NServiceBus/tree/release-4.7.0 specifically https://github.com/Particular/NServiceBus/commit/cc65bd35175e2eddbb53e94c7f977325226b56f6 and https://github.com/Particular/NServiceBus/commit/d414f4966152b49988a1a33a9406a3541c1fbad0

plan to release shortly

mlidbom commented 9 years ago

I intended to find the time to take a closer look but have failed to find the time. One thing does concern me. Seems you just call ToString() on the exception now. I'm almost certain that this will result in incomplete stacktraces in some situations. I'm absolutely sure that it results in harder to read stacktraces than a decent formatting function.

I do not have the time to verify this at the moment, but the way I remember it I specifically changed from the tostring implementation to custom code for the formatting of the exception in our fork because information was missing in the stack traces in the messages on the error queue. The formatting was a bonus, I switched the implementation because we were missing vital information.

SimonCropp commented 9 years ago

@mlidbom

I'm absolutely sure that it results in harder to read stacktraces than a decent formatting function.

Tested this is a few scenarios and it was easy to interpret the results. Feel free to provide some examples where you think it will not work

As for exceptions that override ToString and do not include the stacktrace of the current or inner exception. Yes this is not ideal. But it is preferable to the custom renderer approach.

The custom renderer method is more problematic since

Basically it boils down to a custom render needs to do exception type checking to properly render an exception hierarchy. Yes we can code in if AggregateException and walk its children but NSB cannot cater any other unknown exception types. The fact that we could code in a special case AggregateException simply highlights the problem.

Yes there are edge cases that this is not optimal. But all the solutions that were looked into had similar, or worse, edge cases.

mlidbom commented 9 years ago

Those are some really good arguments. You've convinced me :) Thanks for taking the time to look into it.

SimonCropp commented 9 years ago

@mlidbom thanks for taking the time to help fix a bug