SzymonPobiega / NServiceBus.Router

Cross-transport, cross-site and possibly cross-cloud router component for NServiceBus
MIT License

ASB - Rule too long exception silently absorbed #31

Closed gamblen closed 4 years ago

gamblen commented 4 years ago

Hi, I am trying to connect an MSMQ subscriber to an Azure Service Bus publisher. I created an NServiceBus.Router to connect the two, but I had forgotten to add a rule name shortener to it. Once I added that to the Router it worked fine, but I spent an awful lot of time trying to find the problem. The ASB publisher throws an exception on startup, so it is obvious what the problem is, but that exception is getting absorbed by the Router.

I have forked the repo and added a test to show the problem.

Would it be possible to get the Router to throw the error at startup?

SzymonPobiega commented 4 years ago

@gamblen Thanks for reporting it. I'll look into it over the weekend. Sorry for the time you lost 💔

SzymonPobiega commented 4 years ago

Hi

I attempted to re-create this condition by changing the When_publishing_from_asb_endpoint_oriented to use an event with a very long name. The Publisher endpoint is on ASB. The subscription message from the publisher got to the router and failed to be processed there. After running the default immediate and delayed retries, the Router finally switched to the "throttled" mode and logged the reason. Can you check if you get the error message in the router logs?

gamblen commented 4 years ago

Hi @SzymonPobiega. On my fork, I checked the retry policy and even set a custom policy of no retries, and it still fails with the timeout.
I can see in the log that it is trying again and again, but it never bubbles out to the call.

I also have the application in which I found the problem; if I remove the rule name shortener, it fails. I have now set the poison queue name and can see the error.

That is super then, thanks. Slightly different behavior from the ASB endpoint, which will crash if this happens, but this sends it back to the MSMQ error queue.

SzymonPobiega commented 4 years ago

The main issue here is that with the Router I had to get back to the message-driven pub/sub pattern used by NServiceBus for MSMQ.

The MSMQ endpoint needs to somehow notify the Router that it wants to receive events of type MyEvent. It does so by sending a subscribe message to the router and the Router subscribes to that event on the ASB side. That subscription process fails but the router does not recognize this failure as a configuration problem because it is transport-agnostic.
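To make the flow above concrete, here is a minimal Python sketch rather than NServiceBus code. Every name in it (`FakeBroker`, `Router`, `RuleNameTooLong`, the 50-character limit) is a hypothetical stand-in chosen for illustration, not the Router's or the ASB transport's real API. It only tries to show why a transport-agnostic forwarder cannot tell a naming problem apart from any other failure.

```python
# Illustrative model only; none of these names come from NServiceBus.Router.
MAX_RULE_NAME_LENGTH = 50  # assumed broker limit, for illustration


class RuleNameTooLong(Exception):
    """Hypothetical broker-specific error for an over-long rule name."""


class FakeBroker:
    """Stands in for the broker-side (ASB) subscription store."""

    def __init__(self):
        self.rules = set()

    def subscribe(self, event_type):
        if len(event_type) > MAX_RULE_NAME_LENGTH:
            raise RuleNameTooLong(event_type)
        self.rules.add(event_type)


class Router:
    """Forwards subscribe messages without knowing broker-specific rules."""

    def __init__(self, broker):
        self.broker = broker

    def handle_subscribe(self, message):
        try:
            self.broker.subscribe(message["event_type"])
            return "subscribed"
        except Exception:
            # Transport-agnostic view: any failure is just a failed message
            # that should be retried, not a configuration error.
            return "retry"


router = Router(FakeBroker())
short_result = router.handle_subscribe({"event_type": "MyEvent"})
long_result = router.handle_subscribe(
    {"event_type": "My.Very.Long.Namespace." * 3 + "MyEvent"}
)
```

In this model the long event name ends up on the retry path exactly like a broker outage would, which is the behavior described above.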

The Router's failure handling is designed with the assumption that such poison messages are very unlikely (compared to the broker being unavailable), so it really tries hard to push that message through.
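As a rough sketch of that "try hard" policy, the Python below models immediate and delayed retries followed by either a poison queue (if one is configured) or a throttled mode with a logged reason. The function name, the retry counts and the return values are assumptions made for illustration; they are not the Router's actual defaults or API, and real delayed retries would wait between attempts.

```python
def process_with_retries(message, handler, immediate_retries=5,
                         delayed_retries=3, poison_queue=None, log=None):
    """Try hard to process a message; give up only after all retries."""
    log = log if log is not None else []
    last_error = None
    # One initial attempt plus the immediate and delayed retries.
    # (The waiting between delayed retries is omitted in this sketch.)
    for _ in range(1 + immediate_retries + delayed_retries):
        try:
            handler(message)
            return "processed"
        except Exception as error:
            last_error = error
    if poison_queue is not None:
        # With a poison queue configured, the failure becomes visible there.
        poison_queue.append((message, repr(last_error)))
        return "moved to poison queue"
    # Otherwise only the log records why processing was throttled.
    log.append(f"throttled mode: {last_error!r}")
    return "throttled"


def always_fails(message):
    # Hypothetical handler that fails the way a rejected subscription would.
    raise ValueError("rule name too long")


log, poison = [], []
without_poison_queue = process_with_retries(
    "subscribe MyEvent", always_fails, log=log)
with_poison_queue = process_with_retries(
    "subscribe MyEvent", always_fails, poison_queue=poison)
```

Under these assumptions, a message that can never succeed is only surfaced after every retry has been used up, which matches the long wait reported earlier in this thread.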

One solution to this could be to add some code to the router that recognizes certain processing failures, but that would be tricky to do as that code would be transport-specific.
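One possible shape for such recognition, sketched in Python under stated assumptions: each transport registers its own predicate for "this is a configuration error", and the shared failure handling consults it before retrying. `RuleNameTooLong`, the `"asb"` and `"msmq"` keys and the registry itself are hypothetical and do not exist in the Router.

```python
class RuleNameTooLong(Exception):
    """Hypothetical ASB-specific error used only in this sketch."""


# Each transport would have to supply its own classifier; this is the
# transport-specific part that makes the approach tricky to maintain.
CONFIG_ERROR_CLASSIFIERS = {
    "asb": lambda error: isinstance(error, RuleNameTooLong),
}


def is_configuration_error(transport, error):
    """Ask the transport's classifier, if any, about this failure."""
    classifier = CONFIG_ERROR_CLASSIFIERS.get(transport)
    return classifier is not None and classifier(error)


def handle_failure(transport, error):
    """Fail fast on known configuration errors, retry everything else."""
    if is_configuration_error(transport, error):
        return "fail fast"
    return "retry"


naming_problem = handle_failure("asb", RuleNameTooLong("MyEvent"))
broker_outage = handle_failure("asb", ConnectionError("broker unavailable"))
unknown_transport = handle_failure("msmq", RuleNameTooLong("MyEvent"))
```

The last call shows the cost of the design: a transport without a registered classifier silently falls back to retrying.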

Another solution, which I am more inclined to implement in the future, is centralized routing, in which the subscriptions are managed by a separate service that publishes this information. This way the central service can enforce the naming rules etc. But that would be a separate project ;-)

gamblen commented 4 years ago

Really appreciate the work you have done. Thank you.