dapr / dotnet-sdk

Dapr SDK for .NET
Apache License 2.0
1.11k stars 334 forks source link

Same application doesn't allow to process CloudEvent and raw messages #1294

Open fabistb opened 4 months ago

fabistb commented 4 months ago

Expected Behavior

The same application should be able using the [Topic] attribute to subscribe to CloudEvent and Raw messages. The background is that this would easily allow to migrate from the Servicebus Queue binding to the Servicebus Queue Pub/Sub component and also make it more flexible dealing with messages from non Dapr system.

There is also a unexpected behavior with the optional enableRawPayload property from the [Topic] attribute if the CloudEvent middleware has been removed.

On a different note I also haven't found a point in the docs that it isn't possible to process CloudEvents and Raw message with the same application.

Actual Behavior

It isn't possible to process Raw messages and CloudEvent messages with the same application since the behavior heavily depends on the Dapr CloudEvent middleware.

CloudEvent middleware enabled

CloudEvent middleware disabled

Steps to Reproduce the Problem

Simple Asp.Net Core test application to show the different behaviors. https://github.com/fabistb/dotnet-dapr-pub-sub-raw

Note

This issue has already been mentioned in the past:

Release Note

RELEASE NOTE: FIX Allow to process CloudEvent and Raw messages via Pub/Sub in the same application.

philliphoff commented 3 months ago

@fabistb I did some testing with your sample application (thanks for supplying that!). Here are some comments.

(CloudEvent middleware enabled) Raw message send - enableRawPayload true --> 415 Unsupported Media Type --> unexpected

Because the expected payload type is raw, the Dapr runtime treats it as an opaque object, sending a cloud event payload (content type application/cloudevents+json) to the application with the data content type application/octet-stream. The cloud event handler on the application side unwraps the payload and gives the controller the underlying binary stream, which then complains because it was expecting JSON. In this scenario, it would be the controller's responsibility to deserialize the unwrapped "raw" payload and, hence, the error is expected.

(CloudEvent middleware enabled) Raw message send - enableRawPayload false --> 415 Unsupported Media Type --> expected?

The application stated it does not expect raw payloads but a raw payload was sent anyway. In this case, the Dapr runtime appears to fallback to sending the raw payload as-is but still with the application/cloudevents+json content type. This then causes the cloud events handler to fail (as it's expecting a cloud events envelope). It may be that the Dapr runtime does this because it assumes that a raw payload sent to an application that does not expect one is, itself, already in a cloud events envelope.

(CloudEvent middleware disabled) Raw message send - enableRawPayload true --> 400 Bad Request --> unexpected?

The Dapr runtime sends the raw payload wrapped in a cloud events envelope (content type application/cloudevents+json). However, without the cloud events handler, the content is not unwrapped. ASP.NET falls back to treating the payload as JSON and tries to deserialize the body but cannot find the required properties.

(CloudEvent middleware disabled) Raw message send _enableRawPayload false --> 200 OK --> unexpected?

The Dapr runtime sense the raw payload as-is (i.e. not in a cloud events envelope, as it expects that's what the raw payload already is) with the application/cloudevents+json content type. ASP.NET falls back to treating the payload as JSON and deserializes the body.

In all of these cases, I believe Dapr (both runtime and SDK) are doing the expected thing, if not exactly the most intuitive thing. I think one of the key things to keep in mind is that the "raw payload" flags in the application are for the benefit of the Dapr runtime (i.e. how it should serve raw payloads to the application) and not necessarily for the benefit of the cloud events handling within the application. Even if a raw payload is expected, the general cloud events handling cannot know its ultimate type in order to fully unwrap it for processing by ASP.NET controllers. (I suppose it might be possible to add an optional "expected raw payload type" flag to topics to be used by the cloud events handler to force it to unwrap what otherwise is an opaque binary stream.)

In any case, it's certainly possible for an application to support both cloud event and raw messages in the same application, even when using the cloud events handler; the latter just requires the application to do the deserialization of the "raw" binary stream.

fabistb commented 2 months ago

@philliphoff , sorry for the late reply. I thought I already answered which wasn't the case.

Thanks for the detailed explanation.

Just to ensure that I understood that correctly. As soon as I set enableRawPayload = true the dapr runtime will automatically add the cloud event format in the background? Which just internally treated differently?

I think the main part here is that it really isn't intuitive and kind of blocks any "obvious" parts using raw and cloudevent messages simultaneously.

Especially in migration scenarios this could be interesting.

philliphoff commented 2 months ago

@fabistb I think the important bit is that Dapr pub-sub is fundamentally based on Cloud Events (for good or bad); that is, there is an expectation, by design, that all pub-sub messages delivered to applications by the Dapr sidecar have a Cloud Events envelope (and the SDK handlers assume the same). By default, the Dapr sidecar will take care of generating the envelope. However, when raw payload is indicated the burden shifts to the component to generate the envelope (in cases where the component type is already Cloud Event aware and to avoid "re-wrapping").

While you can process both Cloud Event and "raw" messages, it's going somewhat against the grain of the pub-sub design and, therefore, the application is going to have to do more work on its own to support it.

fabistb commented 2 months ago

@philliphoff thanks again! So the in case of any migration scenario the the burden of managing this relies on the App.

However I think this should be somewhat documented because in the documentation I was able to find it doesn't mentioned this at all. https://docs.dapr.io/developing-applications/building-blocks/pubsub/pubsub-raw/

So I got the assumption that CloudEvent is just removed the CloudEvent at all and kind of behaves like the Servicebus Queue binding just with the Pub/Sub api.

philliphoff commented 2 months ago

@fabistb I think it's a fair point that the docs could be better, as the topic has a lot of nuance in terms of whether the original message was published "raw" or not, how each component supports Cloud Events or not, and whether the subscriber expects "raw" messages or not. Every time I come back to this I have to try to remember how it's supposed to work (and much of that is just inferred from existing behavior, as I wasn't around for the original implementation.) Perhaps you could file an issue in the dapr/docs repo to clarify the expectations for pub-sub?

fabistb commented 2 months ago

@philliphoff , sure. I will create the issue.