Particular / NServiceBus

Build, version, and monitor better microservices with the most powerful service platform for .NET
https://particular.net/nservicebus/
Other
2.1k stars 647 forks source link

Unhandled exceptions and deadlocks when disposing the bus with an external container #4144

Closed cremor closed 8 years ago

cremor commented 8 years ago

NServiceBus version 5.2.16

According to the container documentation page we shouldn't dispose the bus instance manually but instead dispose the container. Quote:

Do NOT call IBus.Dispose when using an external container, instead call dispose in the container during shutdown.

But disposing the container seems to have various - sometimes big - problems.

I've tested all available container samples as of today. In each sample I've removed the using block around the bus and added a call to container.Dispose() (or equivalent) as the last line of code. This are the results:

I don't know if that note from the documentation page is still valid for NServiceBus 6 since the bus isn't injected in the container any more, but out of curiosity I also tried the Unity sample for NServiceBus 6 (Unity is the container I actually want to use). In that sample I replaced endpointInstance.Stop() with container.Dispose().

This resulted in an InvalidOperationException thrown in ExpiredTimeoutsPoller.Dispose().

danielmarbach commented 8 years ago

@cremor Thanks for testing this so thoroughly! Amazing!

Will be talking this through with @Particular/container-maintainers and give you feedback.

timbussmann commented 8 years ago

in addition to what daniel said, the v6 documentation (http://docs.particular.net/nservicebus/containers/?version=Core_6) does not contain that statement about disposing. It is always required to stop the endpoint before cleaning up (maybe we need to clarify this on the linked docs page?).

danielmarbach commented 8 years ago

@Particular/nservicebus-maintainers as far as I see it the problem is the following

  1. Manual container.Dipose
  2. Singleton UnicastBus is disposed by container
  3. UnicastBus.InnerShutdown shutdown transport receivers and strategy (which dispose some fields in the stop process)
  4. Builder.Dispose
  5. Container.Dispose (which then disposes the lifetime scope that is owned by the root container that is getting disposed

Since we register many internally managed types on the user container as well we are now essentially double. That might be fine in the case when we call IBus.Dispose since internally Fody takes care of injecting

    public void Dispose()
    {
      if (Interlocked.Exchange(ref this.disposeSignaled, 1) != 0)
        return;

but not when the container is disposed which then in that synchronous call tries to dispose the IContainer again. If I recall correctly we had several discussions about whether we should dispose the container or not. My opinion is we should not since we cannot guarantee that we always own the container.

I think we should update the doco (ship has sailed for v6, not worth patching in v5) and address this as our larger roadmap of removing the container from the core.

Thoughts on this @Particular/nservicebus-maintainers @Particular/container-maintainers ?

WilliamBZA commented 8 years ago

I think I'm missing a piece of the story here.

What does it mean if a container is disposed? Is that the end of the system completely? Or is a new container going to be instantiated with different configurations so the system as a whole can behave differently?

In summary: Why would we dispose the container, and what is the expected reason and response of that?

cremor commented 8 years ago

In my case it is the end of the process, e.g. the OnStop method of an windows service. I think that's also what the current documentation means because the wording mentions "shutdown".

danielmarbach commented 8 years ago

@cremor We had an internal discussion involving @timbussmann @WilliamBZA @andreasohlund and me. We came to the following conclusion:

We will fix in all the v6 targeted container adapters the way we dispose the containers. The following behavior will be applied: When we own the container, indicated when the user calls UseContainer() then we will dispose the container. If the user passes us a container instance, indicated when the user calls UseContainer(instance) then we will NOT dispose the container since the user owns it.

For v5 we will investigate into applying the same behavior change for the v5 targeted container adapters (starting first with Unity and Ninject) to see if that fixes the issue. We will keep you posted here.

danielmarbach commented 8 years ago

Aligned all v6 related PRs.

danielmarbach commented 8 years ago

Documentation change PR is added https://github.com/Particular/docs.particular.net/pull/1955

danielmarbach commented 8 years ago

Container releases vNext RC have the behavior change in there. We are now doing hotfixes of all previous versions.

danielmarbach commented 8 years ago

All containers except Spring are out. @WilliamBZA is finishing the spring work and I'll release it when done

danielmarbach commented 8 years ago

All containers have been updated and released. Announcement are out in google group and twitter. Closing this.

cremor commented 8 years ago

Are you sure that this is indeed fixed? I just download a fresh Unity sample which already references NServiceBus 5.2.17 and NServiceBus.Unity 6.2.1. But the same change to the sample code as described in the first post of this issue still results in a hang at shutdown.

I also suggest that all the container samples are updated to reflect the recommended usage mentioned on https://docs.particular.net/nservicebus/containers/#using-an-existing-container-cleanup

danielmarbach commented 8 years ago

@cremor your other suggestion is tracked https://github.com/Particular/docs.particular.net/issues/2002

I'll double check. I smoke tested a few containers but probably not Unity.

danielmarbach commented 8 years ago

@cremor I know now what went wrong. I think I misunderstood you slightly here on this issue. I sincerely apologize. With the fixes introduced I fixed the potential double dispose issue and improved how we handle externally passed in containers. But we haven't fixed the fact that the correct way of disposing is:

bus.Dispose();
container.Dispose();

If you leave out the bus.Dispose() we will run into the following issues:

Because we register internal infrastructure into the container (I know big mistake and we are slooooowly moving away from that) what can happen is that the TransportReceiver or the TimeoutPersisterReceiver is disposed first and then the unicast bus. Fody janitor injects a throw ObjectDisposedException into all members and when the Unicast bus calls Stop on the transport receiver it blows up.

regarding unity:

We have there another problem. It seems the UnicastBus is not tracked with the properly lifetime manager, changing the sample you mentioned to include

container.RegisterInstance(bus, new ContainerControlledLifetimeManager());

fixed it. Someone needs to call dispose in order to properly stop the bus. If it isn't stop it will go on forever and eventually hang.

But still I think our only proper way out is to document the dispose sequence. Sorry again.

ramonsmits commented 8 years ago

I just had a similar issue. I could make it work for Unity on core V5 but not with Windsor on either core V5 or V6. V6 has the additional issue that the endpoint instance does not implement IDisposable.

So yes, a workaround is doing bus.Dispose() before disposing the container or in V6 to call await endpointInstance.Stop().ConfigureAwait(false);.

That now means that you need to track all endpoints created separately per container while that should be done by the container.

danielmarbach commented 8 years ago

@ramonsmits for v5 it is not a workaround, it is expected like described here https://docs.particular.net/nservicebus/containers/?version=Core_5#using-an-existing-container-cleanup

In v6 you have to stop the bus separately correct before you dispose the container. Maybe it makes sense to mention that on the documentation page here? https://docs.particular.net/nservicebus/containers/?#using-an-existing-container-cleanup

That now means that you need to track all endpoints created separately per container while that should be done by the container.

Ok let me explain it again. The endpoint is created and owned by the composition root which also owns the container instance. It is the responsibility of the composition root to start and stop the endpoint. If it hosts multiple endpoints then it is the responsibility, again of the composition root, to start and stop multiple endpoints. It is free to decide wether those endpoints should be started sequentially or concurrently. This reponsibility cannot be implemented by the container. The container is no silver bullet and will never be.