aws / dotnet

GitHub home for .NET development on AWS
https://aws.amazon.com/developer/language/net/
Apache License 2.0

Design Doc: AWS Messaging Framework #42

Closed ashovlin closed 2 weeks ago

ashovlin commented 1 year ago

The AWS .NET team is exploring creating an AWS native framework that simplifies development of .NET message processing applications using AWS services.

The design doc can be viewed and commented on in PR #41 (rendered view).

The purpose of the framework would be to reduce the amount of boiler-plate code developers need to write. The primary responsibilities of the proposed framework are:

Here is an example showing a sample publisher and handler for a hypothetical OrderInfo message.

Sample publisher:

[ApiController]
[Route("[controller]")]
public class OrderController : ControllerBase
{
    // See later in the design for how this was configured and mapped to the queue
    private readonly IMessagePublisher _publisher;

    public OrderController(IMessagePublisher publisher)
    {
        _publisher = publisher;
    }

    [HttpPost]
    public async Task Post([FromBody] OrderInfo orderInfo)
    {
        // Add internal metadata to the OrderInfo object 
        // we received, or any other business logic
        orderInfo.OrderTime = DateTime.UtcNow;
        orderInfo.OrderStatus = OrderStatus.Received;

        // The updated OrderInfo object will also be serialized as the SQS message
        await _publisher.PublishAsync(orderInfo);
    }
}

Sample handler:

// See later in the design for how this was configured and mapped to the queue
public class OrderInfoHandler : IMessageHandler<OrderInfo>
{
    public async Task<MessageStatus> HandleAsync(MessageEnvelope<OrderInfo> message, CancellationToken cancellationToken = default(CancellationToken))
    {
        // Here we're reading from the message within the metadata envelope
        var productId = message.Message.ProductId;

        // Here we can do our business logic based on what is in the message
        await UpdateInventory(productId);
        await PrintShippingLabel(productId, message.Message.CustomerId);

        // Indicate that OrderInfo has been processed successfully
        return MessageStatus.Success;
    }
}

On either this issue or on specific section(s) of the design in the PR #41, please comment with:

Kralizek commented 1 year ago

Would it be possible to add a way to read/write message headers so that we can keep the messages relevant to the business logic and use the headers for anything that's not BL?

ashovlin commented 1 year ago

> Would it be possible to add a way to read/write message headers so that we can keep the messages relevant to the business logic and use the headers for anything that's not BL?

@Kralizek - I'm curious, do you have an example of a piece of data that you'd want to append to the headers but not in the actual message?

But I do think it would be possible to append either arbitrary keys or perhaps a dedicated Metadata or Headers section in the message envelope that would be available to publishers/subscribers outside of the .NET type that is serialized as the message.
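As a rough illustration of the idea above, here is a minimal sketch of an envelope that carries a metadata section next to the business message. The names `SketchEnvelope<T>`, `OrderPlaced`, and `EnvelopeSketch` are hypothetical and are not taken from the design doc or the library; the JSON layout is only an assumption for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical envelope type for illustration only; not the framework's actual type.
public sealed class SketchEnvelope<T> where T : class
{
    // Cross-cutting data (correlation IDs, trace headers, ...) lives here.
    public Dictionary<string, string> Metadata { get; set; } = new Dictionary<string, string>();

    // The business message stays free of infrastructure concerns.
    public T Message { get; set; } = default!;
}

// Hypothetical business message.
public sealed class OrderPlaced
{
    public string ProductId { get; set; } = "";
}

public static class EnvelopeSketch
{
    // Publisher side: wrap the message and its metadata into one JSON document.
    public static string Wrap<T>(T message, IDictionary<string, string> metadata) where T : class
    {
        var envelope = new SketchEnvelope<T>
        {
            Message = message,
            Metadata = new Dictionary<string, string>(metadata)
        };
        return JsonSerializer.Serialize(envelope);
    }

    // Subscriber side: recover both the metadata and the typed message.
    public static SketchEnvelope<T> Unwrap<T>(string json) where T : class =>
        JsonSerializer.Deserialize<SketchEnvelope<T>>(json)
        ?? throw new InvalidOperationException("Envelope was empty.");
}
```

A handler would then read something like `envelope.Metadata["correlation-id"]` without that value ever appearing on the `OrderPlaced` type.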

Kralizek commented 1 year ago

> I'm curious, do you have an example of a piece of data that you'd want to append to the headers but not in the actual message?

Things like correlation, session, and/or transaction IDs, authentication tokens, and trace headers.

commonsensesoftware commented 1 year ago

Is there some reason not to use Dapr? Its entire purpose is to provide common messaging infrastructure in a cloud vendor-neutral manner. AWS can already be leveraged for the cloud services backing it. Improving that story might be better than starting anew. Furthermore, there are Dapr SDKs for .NET, Python, Java, JavaScript, Go, and PHP.

normj commented 1 year ago

@commonsensesoftware Dapr is a good choice for users aiming to have a vendor-neutral solution. It does come with the extra complexity of running a sidecar container, and since it is vendor-neutral it ignores some vendor-specific features like FIFO queues for SQS and message attributes.

What we want to create is a library that is more lightweight and can be easily used in any compute environment, whether that is virtual machines, containers, or Lambda, by just including a NuGet package. It won't be the right pick for users wanting a vendor-neutral solution, but things like Dapr and MassTransit are great for those requirements. From our research, a large percentage of users are working with SQS and SNS without those vendor-neutral abstractions and end up creating their own lightweight abstraction. We want to help remove that undifferentiated work we see being done.

jeffhollan commented 1 year ago

I agree Dapr has some additional overhead (the sidecar) and is even language-agnostic in abstracting these. I'm all for this, and would be interested in whether, after "proving" it with AWS, it could be proposed as a general messaging abstraction for any cloud messaging service (Google PubSub, Azure Service Bus, etc.) in .NET, as I think this pattern could be useful across cloud services. But I don't want the team to have to get all clouds to align before doing anything, so I think starting with this, with an eye towards a possible contribution / proof point upstream, would be cool.

adamhathcock commented 1 year ago

I use MassTransit with SQS/SNS. There's a lot of configuration and code around it. Having a similar but more AWS native experience would be nice.

I don't use a lot of the more robust features of MassTransit (Sagas, Outbox, etc) so I could switch if there's a good foundation for SQS/SNS. Also, Kinesis support would be great.

I haven't looked into EventBridge, but I probably should if there's a framework around it.

jonnermut commented 1 year ago

I think this looks like a good initiative that would make SQS/SNS for messaging more approachable.

We have built something similar for our specific messaging architecture:

We might have used MassTransit, but it's too opinionated, and it imposes its own envelope format, making it only compatible with other MassTransit services. We have a very heterogeneous environment with all sorts of non-.NET producers and consumers, which is why we chose CloudEvents as a standards-based envelope.

I notice in the PR that you are making up a new message envelope format. Please consider CloudEvents, or making the envelope format and/or deserialisation pluggable somehow.

Similarly, please consider making the routing logic as pluggable/overridable/flexible as possible, for instance to support our use case of processing messages with Hangfire. Maybe that's actually just a generic IMessageHandler that creates the Hangfire task.

Observability (in the monitoring/logging/tracing sense) is really important!

We need to be able to set message properties when publishing to SNS in order to be able to filter messages at the subscription level.

One last thing to consider for the design: most real services will be running multiple servers/pods. That is probably fine if all instances are sitting there polling SQS and each one grabs a message from the queue, processes it, and deletes it from the queue. But it's not fine if you expect every server to get every message. That's not relevant to our use case, where we pump messages into Hangfire tasks which are then handled by N Hangfire servers, but it's something important to cover off in the design.

SamuelCox commented 1 year ago

Would like to echo the request for CloudEvents.

normj commented 1 year ago

@SamuelCox What are you using CloudWatch Events for that EventBridge doesn't do for you? EventBridge is meant to be a superset of CloudWatch Events.

SamuelCox commented 1 year ago

> @SamuelCox What are you using CloudWatch Events for that EventBridge doesn't do for you? EventBridge is meant to be a superset of CloudWatch Events.

I should have been clearer. I was asking for built-in support for the CloudEvents standard, https://cloudevents.io/, which has nothing to do with CloudWatch Events.

bjorg commented 1 year ago

I just want to urge caution on AWS Kinesis. Unlike the other services, Kinesis semantics are quite different. It's a stream, which means the next record cannot be read until the current one is successfully processed (just like in a file stream). The other message services are not as order-sensitive. When all goes well, they may look the same, but when things go wrong, they act very differently.

bjorg commented 1 year ago

If SNS is used to deliver to SQS, would this abstraction hide that as well? In short, would it unbox the SQS wrapper and also the SNS wrapper before deserializing?

I assume that batched SQS messages would allow for partial failures, correct? So, if 9 out of 10 batch SQS messages were successfully processed, only the failed one would become visible again (instead of all 10).
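The partial-failure behaviour asked about here can be sketched as follows. This is only an illustration under stated assumptions: `PartialBatchSketch` and its signature are hypothetical, messages are modelled as plain (ID, body) pairs rather than SDK types, and the caller is assumed to delete the succeeded messages and leave the failed ones to become visible again (or, on Lambda, to report the failed IDs as batch item failures).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the AWS SDK or the proposed framework.
public static class PartialBatchSketch
{
    // Runs the handler on every (message ID, body) pair and returns the IDs that failed.
    // One failing message does not stop the rest of the batch from being processed.
    public static async Task<List<string>> ProcessAsync(
        IEnumerable<KeyValuePair<string, string>> batch,
        Func<string, Task> handler)
    {
        var failedIds = new List<string>();
        foreach (var item in batch)
        {
            try
            {
                await handler(item.Value);
            }
            catch (Exception)
            {
                // Record the failure and keep going with the remaining messages.
                failedIds.Add(item.Key);
            }
        }
        return failedIds;
    }
}
```

With a batch of 10 where one handler call throws, the returned list holds just that one message ID, so only that message needs to be retried.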

iancooper commented 1 year ago

So, as the owner of Brighter (https://github.com/BrighterCommand/Brighter), which also operates in this space, a few thoughts.

[RequestLogging(0, HandlerTiming.Before)]
[UsePolicyAsync(step:1, policy: Policies.Retry.EXPONENTIAL_RETRYPOLICYASYNC)]
public override async Task<AddPerson> HandleAsync(AddPerson addPerson, CancellationToken cancellationToken = default(CancellationToken))
{
    await _uow.Database.InsertAsync<Person>(new Person(addPerson.Name));

    return await base.HandleAsync(addPerson, cancellationToken);
}

which is very similar to your pitch, so it's not really a USP for you.

A project like Brighter is only going to go so deep on AWS integration; the real benefit to the .NET ecosystem would be if you went deeper than we could.

lee-11 commented 1 year ago

I really hope this isn't another "make the easy stuff easier" endeavor. There are difficult bits that others have mentioned. Implementing messaging systems that "fail well" rather than "fail poorly" is the real challenge.

normj commented 1 year ago

@bjorg

> I just want to urge caution on AWS Kinesis. Unlike the other services, Kinesis semantics are quite different. It's a stream, which means the next record cannot be read until the current one is successfully processed (just like in a file stream). The other message services are not as order-sensitive. When all goes well, they may look the same, but when things go wrong, they act very differently.

I'm also skeptical that Kinesis fits well in this library, with the same concerns you have about it being a stream, but I'm not ready to close the door completely.

> If SNS is used to deliver to SQS, would this abstraction hide that as well? In short, would it unbox the SQS wrapper and also the SNS wrapper before deserializing?

> I assume that batched SQS messages would allow for partial failures, correct? So, if 9 out of 10 batch SQS messages were successfully processed, only the failed one would become visible again (instead of all 10).

Yes, the library would take care of unwrapping the SNS envelope. I also want to make sure the serialization/deserialization is extensible, so an advanced user could register their own serialization implementation in the DI for the library to use. Yes on handling partial failures.
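A minimal sketch of what unwrapping the SNS envelope might look like, assuming raw message delivery is disabled on the subscription so that the SQS body is the SNS notification JSON with `"Type": "Notification"` and the original payload in its `"Message"` field. `SnsUnwrapSketch` is a hypothetical name; this is not the library's implementation.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration; not the library's actual unwrapping code.
public static class SnsUnwrapSketch
{
    // Returns the inner payload if the SQS body is an SNS notification wrapper,
    // otherwise returns the body unchanged.
    public static string UnwrapIfSns(string sqsBody)
    {
        try
        {
            using var doc = JsonDocument.Parse(sqsBody);
            var root = doc.RootElement;

            if (root.ValueKind == JsonValueKind.Object
                && root.TryGetProperty("Type", out var type)
                && type.ValueKind == JsonValueKind.String
                && type.GetString() == "Notification"
                && root.TryGetProperty("Message", out var inner)
                && inner.ValueKind == JsonValueKind.String)
            {
                // The original publisher payload is carried as a JSON string.
                return inner.GetString() ?? sqsBody;
            }
        }
        catch (JsonException)
        {
            // Not JSON at all: treat the body as a raw payload.
        }

        return sqsBody;
    }
}
```

The result would then be handed to whichever serializer is registered, so handlers see the same payload whether the message arrived directly on SQS or via SNS.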

normj commented 1 year ago

@iancooper I really don't want to compete with all of the great projects out there like Brighter. If anything, we should be helping those projects where they need AWS help fitting AWS services into those abstractions.

But as you say all those libraries are trying to treat all the message brokers the same. And that is fine for many users. In our conversations with .NET developers using AWS a large percentage of them don't need/want the generic abstraction because they are all in on AWS and want an easier way to use AWS services but not remove any of the capabilities of the service. That is where I see this library fitting in and that is what we are seeing developers implement themselves over and over again.

The handler code snippets were an easy, quick view of the experience, but like you said, I'm sure that is common across all of these libraries. I think what I should do is take another pass through the design doc to emphasize how you can still get access to the advanced AWS features, like handling message group IDs and deduplication IDs for FIFO, and how to pass additional SNS or SQS attributes.

normj commented 1 year ago

@lee-11

> I really hope this isn't another "make the easy stuff easier" endeavor. There are difficult bits that others have mentioned. Implementing messaging systems that "fail well" rather than "fail poorly" is the real challenge.

I totally agree that focusing on failing well and fault tolerance is the critical part needed for this library and arguably the biggest amount of work to get right. That is where we could use as much feedback as possible on what people want to happen when things don't go as expected.

normj commented 1 year ago

@SamuelCox That is great feedback about looking into how CloudEvents fits into this library. Thanks!

iancooper commented 1 year ago

> @iancooper I really don't want to compete with all of the great projects out there like Brighter. If anything, we should be helping those projects where they need AWS help fitting AWS services into those abstractions.

I know, and a rich ecosystem of choices is good.

> The handler code snippets were an easy, quick view of the experience, but like you said, I'm sure that is common across all of these libraries. I think what I should do is take another pass through the design doc to emphasize how you can still get access to the advanced AWS features, like handling message group IDs and deduplication IDs for FIFO, and how to pass additional SNS or SQS attributes.

Yeah, that is the sweet spot IMO

dev-ayaz commented 1 year ago

> Would it be possible to add a way to read/write message headers so that we can keep the messages relevant to the business logic and use the headers for anything that's not BL?

You can use message attributes for this purpose.

birojnayak commented 1 year ago

We are doing something similar for web services that still use SOAP (HTML design and open source PR). This would enable any queue transport (SQS, Amazon MQ, RabbitMQ, MSMQ, etc.) with any cloud provider; basically, a more generic transport layer (architecture PR for reference). So I thought I'd share a few challenges we are solving that may be useful for what is being built here:

  1. How to maintain message ordering, so that while the service is executing its logic, another message is not picked from the queue and processed if the queue is FIFO (considering all API/logic execution is async). How should we ensure this on both the producer side and the consumer side?
  2. How to provide extensibility points back to developers rather than making the decision for them, so that they can opt for their own way of serving the notification (SNS, Lambda, CW, EventBridge, etc.), giving them full message visibility and the status of business logic execution so that they can decide whether or not to put the message into a DL queue.
  3. How to provide an extension point so that developers can supply their own encoding and decoding mechanism for what goes inside the queue, and have our logic honor it. What if they want to send a message in chunks?
  4. There could be one consumer with multiple producers. What can we do to keep the context of each producer and make it visible to the consumer for full isolation, and how do we handle the security context in those cases?

iancooper commented 1 year ago

> 1. How to maintain message ordering, so that while the service is executing its logic, another message is not picked from the queue and processed if the queue is FIFO (considering all API/logic execution is async). How should we ensure this on both the producer side and the consumer side?

Brighter uses a single-threaded message pump (scale via competing consumers and more pumps), and does not recommend an async pipeline when order is important. Not de-ordering is a challenge, though, if you want to use ideas like an outbox. It can also be a challenge for consumers that simply offload a message to a thread from the thread pool (don't do that; it won't scale unless you can apply backpressure).
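To make the pump-plus-backpressure idea concrete, here is a minimal sketch using a bounded `System.Threading.Channels` channel with a single consumer loop. It is not Brighter's implementation; `MessagePumpSketch` is a hypothetical name, the `Task.Yield()` call stands in for real handler work, and the single loop processes one message at a time in arrival order rather than pinning work to one OS thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical sketch for illustration; not Brighter's or the AWS framework's pump.
public static class MessagePumpSketch
{
    public static async Task<List<string>> RunAsync(IEnumerable<string> incoming, int capacity)
    {
        // Bounded channel: once `capacity` messages are buffered, writers wait.
        // That waiting is the backpressure on whoever is receiving from the queue.
        var channel = Channel.CreateBounded<string>(new BoundedChannelOptions(capacity)
        {
            SingleReader = true,
            FullMode = BoundedChannelFullMode.Wait
        });

        var processed = new List<string>();

        // The pump: a single consumer loop, so messages are handled in order.
        var pump = Task.Run(async () =>
        {
            await foreach (var message in channel.Reader.ReadAllAsync())
            {
                await Task.Yield(); // stand-in for real handler work
                processed.Add(message);
            }
        });

        foreach (var message in incoming)
        {
            // Completes immediately while there is room, otherwise waits for the pump.
            await channel.Writer.WriteAsync(message);
        }
        channel.Writer.Complete();

        await pump;
        return processed;
    }
}
```

Scaling out under this model means running more pumps (competing consumers), not adding concurrency inside one pump.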

> 2. How to provide extensibility points back to developers rather than making the decision for them, so that they can opt for their own way of serving the notification (SNS, Lambda, CW, EventBridge, etc.), giving them full message visibility and the status of business logic execution so that they can decide whether or not to put the message into a DL queue.

We tend to offer specific exceptions that you can throw. Some folks don't like using exceptions for that, which would mean you need to use return code values.

> 3. How to provide an extension point so that developers can supply their own encoding and decoding mechanism for what goes inside the queue, and have our logic honor it. What if they want to send a message in chunks?

Brighter (and, I think, JustSaying) offers you the ability to register a 'mapper' that maps between the wire message body and your internal types. Brighter treats the body as a byte array and is agnostic to the encoding you want to use (protobuf, Avro, etc.). Within the AWS space, though, you might be more able to assume most folks will use JSON, as you are layering over an HTTP API in most contexts.
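A minimal sketch of such a mapper, assuming a byte-array wire body and JSON as the default encoding. `IWireMapper<T>`, `JsonWireMapper<T>`, and `AddPersonSketch` are hypothetical names for illustration and do not reflect Brighter's, JustSaying's, or the AWS framework's actual API.

```csharp
using System;
using System.Text.Json;

// Hypothetical mapper contract: the transport only ever sees bytes.
public interface IWireMapper<T> where T : class
{
    byte[] ToWire(T message);
    T FromWire(byte[] body);
}

// A JSON mapper; a protobuf or Avro mapper would implement the same interface.
public sealed class JsonWireMapper<T> : IWireMapper<T> where T : class
{
    public byte[] ToWire(T message) => JsonSerializer.SerializeToUtf8Bytes(message);

    public T FromWire(byte[] body) =>
        JsonSerializer.Deserialize<T>(body)
        ?? throw new InvalidOperationException("Body deserialized to null.");
}

// Hypothetical internal message type.
public sealed class AddPersonSketch
{
    public string Name { get; set; } = "";
}
```

Because the contract is expressed in bytes, swapping the encoding means registering a different `IWireMapper<T>` without touching the handlers.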

> 4. There could be one consumer with multiple producers. What can we do to keep the context of each producer and make it visible to the consumer for full isolation, and how do we handle the security context in those cases?

Normally you would need to use the headers for this, I suspect, though I don't know if I understand exactly what you are asking.

alexeyzimarev commented 1 year ago

MassTransit already supports RMQ (Amazon MQ) and SQS. Why not support an open-source project with nearly 38 million downloads on NuGet, and build additional transports (or Riders) for it instead of building something completely new?

noahtrilling commented 1 year ago

I would also like to encourage this team to consider making contributions to MassTransit. It already has wide adoption in the .NET space. Our organization is running MassTransit over RabbitMQ currently. In our organization, I know we are much more likely to adopt a new transport on our existing messaging framework than we are to adopt a new messaging framework and be forced into a new transport. Adding MassTransit support for EventBridge would make me much more likely to encourage its adoption in my organization. Knowing that the AWS .NET developers are actively engaged in contributing and maintaining SNS and SQS features in MassTransit would make me significantly more likely to adopt those services as well.

The design document mentions some 'technical constraints' you'd prefer to avoid in using a third party library. Could you enumerate those? MassTransit's architecture has proven extremely flexible to the addition of new transports and Riders and I'm confident the MassTransit community would happily welcome your contributions.

I believe that many other organizations will feel as I do: if AWS contributes to vendor-neutral frameworks, I'll be much more likely to adopt AWS transports, because AWS will have some skin in the game. If not, I'll stick with both vendor-neutral transports AND a vendor-neutral framework to avoid lock-in. I believe AWS and the .NET community will see substantially more return on a much more limited investment by encouraging contribution to existing OSS messaging frameworks, particularly MassTransit.

danielmarbach commented 1 year ago

For full transparency, I want to mention that I work for Particular Software, the makers of NServiceBus. With that out of the way, I want to share my personal opinion, which is based on my history of contributing to several "SDKs" and abstractions, including NServiceBus and MassTransit.

I believe the AWS team could make a much bigger impact with their limited resources by addressing two things in the current SDK:

For example, the Azure SDKs for Event Hubs and Service Bus provide "lower level" primitives that require you to manually manage messages/events, settlement, lock renewal, and more. But if you don't want to worry about these things, you can opt in to higher-level primitives that, for example, allow setting the concurrency, auto-lock renewal, and more, and you simply get called by the SDK when a message is available. In this mode, you don't have to worry about proper concurrency, cancellation token handling, behind-the-scenes renewal threads, etc. All of that is neatly packaged into a slightly higher-level abstraction.

Having something like that available helps developers get started more quickly with the services they want to use, without interfering too much with already available offerings. In fact, based on my own experience, I can say that having those primitives available in the Azure SDK makes implementing things like receiving messages, batching, and more for the "abstraction" NServiceBus so much easier.

mgmccarthy commented 1 year ago

> Would it be possible to add a way to read/write message headers so that we can keep the messages relevant to the business logic and use the headers for anything that's not BL?

NServiceBus (a messaging framework for .NET) has the concept of headers (https://docs.particular.net/nservicebus/messaging/headers). Super useful for handling things like infra, cross-cutting concerns, correlation IDs, etc.

Having a construct wrapped over the top of MessageAttributes would be very useful in an AWS messaging framework.

embano1 commented 1 year ago

> NServiceBus (a messaging framework for .NET) has the concept of headers

Since CloudEvents was also mentioned in this thread (events as a sub-class of messages), just making a plug here for its transport bindings which make heavy use of protocol headers (e.g. HTTP, Kafka, RabbitMQ, etc.) and content-type hints to project metadata and keep business logic (domain objects/payload) free from these, i.e. HTTP body is payload only.

ashovlin commented 1 year ago

Hello all, thank you for the feedback so far! We've just pushed a revision to the design document with:

nazarenom commented 1 year ago

Hello, for one of our future projects, we're looking into SQS/SNS for backend interactions. It would be handy to deal with a higher-level framework, like the one proposed in this issue, rather than with the SDK's API.

Is there any news you could share or any ETA?

normj commented 1 year ago

@nazarenom Thanks for your interest. We don't have an ETA right now, but please let us know if you have any requests/requirements that are not covered in the design doc. We are just starting the initial code construction, so it's a great time to make sure we have the groundwork for future features.

nazarenom commented 1 year ago

Thanks for the reply, @normj. We looked at the design documentation, which looks promising. For the vast majority of the things we plan to do, it's what we need. Do you have any plans/ideas about supporting retry capabilities or outbox-like features similar to what the mentioned community frameworks do?

Thanks again. Regards.

mgmccarthy commented 1 year ago

Would really love to see a Saga implementation as part of this effort. NServiceBus, MassTransit, etc. all have very robust Saga implementations tied directly into the framework. While I'm aware of AWS Step Functions, they lie outside the bus architecture that is being proposed, and I'm not wild about either the visual designer or the JSON that represents the Step Function workflow. Other .NET-based service buses allow you to specify a Saga using code.

jeroenbai commented 1 year ago

Always good to have options and to make AWS services easier to use.

The way I currently use SQS: an AWS Lambda function with API Gateway receives webhooks (just a simple Node.js Lambda function), and then uses the webhook call headers/payload to publish messages to SQS using AWS.SQS / sendMessage. Then, a .NET / C# application reads messages off the queue and cleans up using AmazonSQSClient / ReceiveMessageRequest / DeleteMessageBatchRequest.

normj commented 1 year ago

We have made our repository where we are doing our development public. The work is still very much in progress.

https://github.com/awslabs/aws-dotnet-messaging

Kralizek commented 1 year ago

Would it be possible to have an overview of the upcoming features? Maybe using GitHub Projects?

normj commented 1 year ago

@Kralizek We kind of have a crude version in the README file here: https://github.com/awslabs/aws-dotnet-messaging#project-status

We don't use GitHub Projects because it doesn't work well with our sprint planning, which covers many public and internal projects. We have tried it a few times, and for all of the things our team owns it ends up getting out of date and forgotten.

For features you want us to think about adding I recommend opening a GitHub issue.

ashovlin commented 2 weeks ago

Thanks again for all of the feedback on the design!

We launched the developer preview version of the library in March 2024.

Please follow along with the development and leave comments/issues over on https://github.com/awslabs/aws-dotnet-messaging moving forward.