Particular / NServiceBus.Transport.AzureServiceBus

Azure Service Bus transport

Add support for cross-topic subscriptions. #511

Open marcselis opened 2 years ago

marcselis commented 2 years ago

We would like to make it possible for an endpoint to subscribe to events that are published on a topic other than the one the endpoint itself uses.

Motivation: We do not have a dedicated NSB operations team in our company; instead we have a number of independent DevOps squads that each monitor their own NServiceBus endpoints. Our business deals with highly sensitive data (personal and wage data), and ServiceControl and ServicePulse lack the fine-grained security needed to limit which endpoints and (failed) messages a user can see and act upon, so we cannot use a single ServiceControl, Audit & Monitor set-up to monitor all NSB endpoints. (See also ServicePulse issue #453.) At the moment we have standardized on SQL Transport: each of our squads has a dedicated schema in a single central NSB database, and in each schema a dedicated ServiceControl, Audit & Monitor instance is installed that the squad uses to monitor its own endpoints. Cross squad communication is easy in SQL Transport:

We now have a number of new applications hosted in Azure that want to use the Azure Service Bus transport, so we would like to replicate our current SQL set-up in Azure Service Bus. We discussed this with Particular Support (see support case #00063703) and together we came up with a single Azure Service Bus namespace with a separate topic for each squad, where each squad again has its own ServiceControl, Audit & Monitoring instance to monitor its own endpoints. This set-up works more or less:

All of our squads use an in-house-built framework to configure NSB, and in it we managed to write code that creates cross-topic subscriptions directly in ASB using the ASB client that is referenced by the NSB ASB Transport, but this is not very sustainable. For example, in the latest version of the NSB ASB Transport you switched from the Microsoft.Azure.ServiceBus package to Azure.Messaging.ServiceBus, which broke our code when we upgraded to that version.

But more importantly, we are also trying to set up a router to allow communication between the on-premises SQL transport and the ASB transport, and there we are totally stuck:

SzymonPobiega commented 2 years ago

Hi @marcselis

If I understand correctly it is possible to achieve your goal with the current version of the router. Here is a draft PR that modifies the SQL Switch sample to route between ASB topics within the same namespace and a SQL transport. It uses a single instance of the Router with three interfaces.

marcselis commented 2 years ago

OK, but that still leaves the problem that an endpoint in an ASB namespace cannot subscribe to messages published by another endpoint in the same ASB namespace that uses a different topic to publish its messages. Technically this is perfectly possible, as in the ASB portal there is no "link" between a queue and a topic. You just create a subscription on a topic and configure it to forward to any queue you want.

SzymonPobiega commented 2 years ago

> ok, but that still leaves the problem that an endpoint in an ASB namespace cannot subscribe to messages published by another endpoint in the same ASB namespace but using another topic to publish its messages.

That's what that sample solves, via the router.

marcselis commented 2 years ago

To me that feels like using a cannon to shoot a fly. As I explained in the router issue, we currently have 3 or 4 topics in the same namespace, but that number will organically grow to 20+ in the next few years. Every time a new topic is added, we would need to change, test and redeploy our router to support routing messages between the new topic and all the others.
All to get something to work that already works when using ASB directly: If I go to the Azure portal and manually create a new subscription in topic A and point it to the receiving queue of an endpoint that is using topic B, it works. No router required. It would be really nice if the NSB ASB transport could support that out of the box, so that we don't need to create the subscriptions manually and use the router for what it does best: routing between different transports. I'm a huge fan of NServiceBus, but I'm having a hard time convincing my colleagues to use it, instead of directly using the ASB API as they currently do. They only see what it can't do, and not (yet) the huge benefits they get in return...

SzymonPobiega commented 2 years ago

I see your point. I looked at the command line tool for the ASB transport and it seems that it would be possible to use it to achieve your goal.

The subscribe command implemented in this class lets you pass the name of the topic. With it you can subscribe an endpoint to an event published on a topic other than the endpoint's own topic.

We recommend always scripting the subscriptions for production deployments rather than relying on the auto-subscribe capability, so that should not add any complexity to your deployment. The reason for this recommendation is the unpredictable nature of auto-subscribe.

marcselis commented 2 years ago

Thanks, that will definitely help.

I've read the recommendation not to rely on the auto-subscribe and auto queue creation capabilities, but so far we have been using them for the deployment of all of our endpoints without any problems. I can relate to the fact that scripting makes things more explicit and is more secure, as your endpoints don't need admin rights to create tables or queues. But it complicates the deployment a lot: in order to deploy an endpoint correctly, you also need to know whether the endpoint is processing new event types, because then you also need to create the new subscriptions. Failing to do so won't crash your endpoint; it will just not receive those new events. And that is much harder to detect! It is very difficult to find out which events have been missed and to get them republished without any consequences for other subscribers that did process them. That is the main reason why we still rely on the auto* capabilities.

SzymonPobiega commented 2 years ago

@marcselis I understand. I labeled it as a candidate for future enhancement release. That does not mean that we'll handle it in the very next minor release -- this is still a subject for prioritization done by the team that works on the release.

Regarding the missing events, I think this topic is also very interesting. If I understand you correctly, the missing thing is a mechanism that would ensure that every subscriber that needs to process the event is subscribed when the publisher starts publishing the events. Let me validate if I understand the scenario correctly.

Suppose there is an endpoint that processes DoSomething commands and updates its database. Let's call it A. At some point the team responsible for endpoint A decides that it has grown too big and wants to split the responsibility for the DoSomething command into three parts:

From the outside the work that needs to be done did not change, it has just been split into three parts.

So in that case when the change is going to be deployed, I need:

Now, if the second step (subscription) failed silently then either B or C or both will skip some events and it will be very hard to publish them retroactively.

Does my description reflect well the reality of the problems you are facing?

marcselis commented 2 years ago

That is a good example, but it doesn't have to be that complicated.

Let me give you another example: We need to send a lot of declarations to the government. That is done by uploading xml files to a dedicated sftp location. The government then processes these files and drops a response in another sftp location where we need to fetch it. We have created a central process that sends the files and receives all responses. That central process downloads all files that are in our inbox folder and publishes an event for each. We have different subscribers that examine the file (name and contents) to check whether the response is for them. (Sometimes one response file needs to be processed by multiple processes.)

Suppose we have a service running that due to changed legislation suddenly needs to send a declaration to the government and process its response. We do this by deploying a new version that will now create the xml file on a central place and send a command to the central service to upload that file to the government. But of course it also needs to subscribe to the events that get published by the central component that indicate that a file was downloaded.

We receive 100K files per day from the government. So you can imagine the mess finding out the exact events that were missed and resend those to that single endpoint that needed them.

IMO it would be best to split the auto-subscription mechanism in 2 parts:

  1. detection of event handlers and missing subscriptions for them and
  2. creation of the missing subscriptions.

Part 1 should always run and fail if it detects a missing subscription and part 2 is disabled. That would make the newly deployed endpoint crash when the pipeline didn't create the needed subscription, before it got the chance of sending out a file, and no events are lost.
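To make the proposed split concrete, here is a minimal sketch of the two parts. Everything in it is hypothetical: `SubscriptionCheck`, `FindMissing`, `EnsureSubscriptions` and the `autoCreate` flag are illustration names, not part of NServiceBus or the ASB transport, and event types are represented as plain strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; none of these names exist in NServiceBus.
public static class SubscriptionCheck
{
    // Part 1 (detection): handled event types that have no subscription yet.
    public static IReadOnlyList<string> FindMissing(
        IEnumerable<string> handledEvents,
        IEnumerable<string> existingSubscriptions)
    {
        var existing = new HashSet<string>(existingSubscriptions, StringComparer.Ordinal);
        return handledEvents.Where(e => !existing.Contains(e)).Distinct().ToList();
    }

    // Detection always runs; creation (part 2) runs only when autoCreate is true.
    // With autoCreate disabled, a missing subscription fails startup.
    public static void EnsureSubscriptions(
        IEnumerable<string> handledEvents,
        IEnumerable<string> existingSubscriptions,
        bool autoCreate,
        Action<string> createSubscription)
    {
        var missing = FindMissing(handledEvents, existingSubscriptions);
        if (missing.Count == 0)
        {
            return;
        }

        if (!autoCreate)
        {
            throw new InvalidOperationException(
                "Missing subscriptions: " + string.Join(", ", missing));
        }

        foreach (var eventType in missing)
        {
            createSubscription(eventType);
        }
    }
}
```

With `autoCreate` set to false, an endpoint whose deployment pipeline forgot a subscription would throw during startup instead of silently missing events.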

ramonsmits commented 2 months ago

Topic per event topology

A very simple variation would be to use a topic per event. This has some potential drawbacks: it does not support inheritance, and the number of topics is limited, although that limit is likely not an issue for most customers.

That is also solvable by grouping events into a single topic. For example, per assembly which could represent all events published by a service boundary or a specific component. It could also just use a "correlation filter" instead of the "SQL filter" and result in improved performance.

Strict validation on not allowing inheritance

NServiceBus could for example have a default validation to only allow events to be published that are

Message examples:

// ✅
public sealed record MyEvent : IPrivate
{
}

// ❌
public record MyBase
{
}

// ❌
public sealed record MyOtherEvent : MyBase
{
}

Potentially users could opt-out of the validation with a "trust me I know what I'm doing" type of method.
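A rough sketch of what such a validation could look like, assuming (based only on the message examples above) that the rule is "sealed and not derived from another class". `EventTypeValidator` and `IsPublishable` are hypothetical names, and `IPrivate` is declared here as an empty marker interface purely so the sample compiles.

```csharp
using System;

// Hypothetical validator; not an NServiceBus API.
public static class EventTypeValidator
{
    // Assumed rule: the type is sealed and its direct base type is object.
    // Implementing interfaces is still allowed.
    public static bool IsPublishable(Type eventType) =>
        eventType.IsSealed && eventType.BaseType == typeof(object);
}

// Marker interface assumed for illustration only.
public interface IPrivate { }

public sealed record MyEvent : IPrivate { }      // passes: sealed, no base class

public record MyBase { }                         // fails: not sealed

public sealed record MyOtherEvent : MyBase { }   // fails: derives from MyBase
```

Under this assumed rule, interface types themselves would also be rejected, so events declared as interfaces would need the opt-out.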

Topic grouping

Grouping events on the same topic based on an attribute like

[NServiceBus.Transport.AzureServiceBus.Topic("Sales")]
public sealed record MyEvent : IPrivate
{
}

or convention:

// Use assembly name
transport.TopicProvider = (type) => type.Assembly.GetName().Name;
// .. or namespace
transport.TopicProvider = (type) => type.Namespace;
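The attribute and the convention could also be combined, with the attribute taking precedence. The sketch below is only an illustration of that idea: `TopicAttribute` and `TopicResolver` are hypothetical stand-ins declared locally, not types shipped by the transport, and `OrderPlaced` and `OrderShipped` are made-up events.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute mirroring the proposal; not part of the transport.
[AttributeUsage(AttributeTargets.Class)]
public sealed class TopicAttribute : Attribute
{
    public TopicAttribute(string name) => Name = name;

    public string Name { get; }
}

public static class TopicResolver
{
    // An explicit attribute wins; otherwise fall back to the supplied convention.
    public static string Resolve(Type eventType, Func<Type, string> convention)
    {
        var attribute = eventType.GetCustomAttribute<TopicAttribute>();
        return attribute?.Name ?? convention(eventType);
    }
}

[Topic("Sales")]
public sealed record OrderPlaced;   // resolved via the attribute

public sealed record OrderShipped;  // resolved via the convention
```

A caller would pass the convention as a delegate, for example `TopicResolver.Resolve(typeof(OrderShipped), type => type.Assembly.GetName().Name)`.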

It could even allow multiple topics for the same message, although that likely would not be recommended:

[NServiceBus.Transport.AzureServiceBus.Topic("Sales")]
[NServiceBus.Transport.AzureServiceBus.Topic("Finance")]
public sealed record MyEvent : IPrivate
{
}

or convention:

// Use both assembly name and namespace
transport.Advanced.MultiTopicProvider = (type) => new []{ type.Assembly.GetName().Name, type.Namespace };