spring-cloud / spring-cloud-stream

Framework for building Event-Driven Microservices
http://cloud.spring.io/spring-cloud-stream
Apache License 2.0

Support aggregation for the general case (not only Source/Sink/Processor) #450

Closed mbogoevici closed 5 years ago

mbogoevici commented 8 years ago

As a developer, I want to be able to aggregate applications with arbitrary i/o channel structure, so that I can address more potential use cases.

geoand commented 7 years ago

I am currently looking into Spring Cloud Stream (and Spring Cloud Data Flow) as a replacement for topologies that our organization is currently writing for Apache Storm (in which the programming model feels so tedious compared to the Spring programming model).

I have to say that I am thoroughly impressed by what I have seen so far from the project, but the feature described in this user story is the single feature I think is missing the most. IMHO, the ability to create complex aggregated applications in which the input 'flows' through a graph-like path is important for all but the most trivial applications.

mbogoevici commented 7 years ago

@geoand Thanks for looking into Spring Cloud Stream and for the feedback.

Just trying to clarify this a little bit. For complex distributed topologies, orchestration and deployment (on a number of different platforms) are handled by Spring Cloud Data Flow - see http://cloud.spring.io/spring-cloud-dataflow/ for details.

Aggregation in the context of this story refers to merging together a number of independently deployable Spring Cloud Stream apps in a single aggregate that interacts 'in-process' - for reducing the overhead of a broker hop and serialization. Is that what you meant?

In any case, thanks and looking forward to more contributions and feedback!

geoand commented 7 years ago

@mbogoevici Yes that is exactly what I meant!

I understand that Spring Cloud Data Flow aims to create complex topologies where each component handles specific parts of a stream pipeline, but what I am looking forward to is the ability to route messages within the same application process, in order to reduce the chattiness on the broker that would be incurred if everything were handled in the Spring Cloud Data Flow fashion.

An example I have in mind is the following:

In the scenario above I would like to avoid having to communicate via the broker and would prefer everything to be in the same process.

I understand that there is a very fine line between too much functionality in the application on one end, and too much chattiness on the other end. However, I believe that where to draw this line should be up to the developer :)

If there is anything else you would like to know, please let me know :)

mbogoevici commented 7 years ago

@geoand Great, thanks for the feedback! I don't think it should be very hard to amend our DSL to support something like that (also accounting for the pubsub nature of the intermediary channels). Any further contributions are welcome, too!
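As a rough illustration of what the pub-sub nature of the intermediary channels means for an in-process aggregate, here is a minimal plain-Java sketch. It does not use the Spring Cloud Stream aggregation DSL; `InProcessChannel`, `PubSubAggregateSketch`, and the `audit:`/`store:` sinks are hypothetical names made up for this example. The point is only that every subscriber of an intermediary channel sees every message, with no broker hop or serialization in between.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-process channel (NOT a Spring Cloud Stream type):
// each message sent is delivered synchronously to every subscriber.
final class InProcessChannel<T> {
    private final List<Consumer<T>> subscribers = new ArrayList<>();

    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    void send(T message) {
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(message);
        }
    }
}

final class PubSubAggregateSketch {
    // Wires input -> processor -> intermediate -> two sinks, all in one JVM.
    static List<String> run(List<String> inputs) {
        InProcessChannel<String> input = new InProcessChannel<>();
        InProcessChannel<String> intermediate = new InProcessChannel<>();
        List<String> results = new ArrayList<>();

        // Processor stage: transforms and forwards to the intermediary channel.
        input.subscribe(m -> intermediate.send(m.toUpperCase()));

        // Two independent consumers of the same intermediary channel (pub-sub).
        intermediate.subscribe(m -> results.add("audit:" + m));
        intermediate.subscribe(m -> results.add("store:" + m));

        inputs.forEach(input::send);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
    }
}
```

A real aggregate would also have to decide which channels stay bound to the broker and which become local; this sketch leaves that question out.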

geoand commented 7 years ago

Sounds great!

If I come up with anything else, I'll of course share it with you and the rest of the team.

Thank you for your time!

PedroAlvarado commented 7 years ago

The value of #449 is truly realized only by being able to aggregate in the general case. On behalf of a team of seven, I ask that we move this one to the top of the priority list. See my comments on #449 for another use case.

Moreover, there is a cost-savings and performance story to this work. We have a use case where we will move 1.2 billion events per day through a pipeline. The benefits of batching, compression, data formatting, and other techniques can be dwarfed by the ability to remove a physical hop while keeping a logical in-JVM hop in the pipeline. This can reduce broker costs by double-digit percentages (50% in our case) while removing the need for other clusters (the hops removed). Users will remove as many hops as they can to save costs and gain performance anyway (we do). It would be great to further support their efforts through this feature while giving them the flexibility to continue to build reusable components/apps for other pipelines.

olegz commented 5 years ago

I believe this is already accomplished with Spring Cloud Function and its composition ability. Closing
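For readers unfamiliar with the idea, the general shape of in-process composition can be sketched with nothing but the JDK's `java.util.function.Function`. This is only an illustration of the concept: `CompositionSketch` and its three stages are hypothetical, and the sketch says nothing about how Spring Cloud Function itself declares or wires composed functions.

```java
import java.util.function.Function;

final class CompositionSketch {
    // Three independently reusable stages (hypothetical examples).
    static final Function<String, String> TRIM = String::trim;
    static final Function<String, String> UPPERCASE = String::toUpperCase;
    static final Function<String, Integer> LENGTH = String::length;

    // Composes the stages into one function; each "hop" is a plain method
    // call inside the JVM, with no broker and no serialization in between.
    static Function<String, Integer> pipeline() {
        return TRIM.andThen(UPPERCASE).andThen(LENGTH);
    }

    public static void main(String[] args) {
        System.out.println(pipeline().apply("  hello "));
    }
}
```

Each stage remains usable on its own in other pipelines, which is the reuse property asked for earlier in the thread. Note that linear composition like this covers chains; the fan-out, graph-like case would need more than `andThen`.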