iluwatar / java-design-patterns

Design patterns implemented in Java

https://java-design-patterns.com

Kappa Architecture #892

Open ranjeet-floyd opened 5 years ago

ranjeet-floyd commented 5 years ago

Description: The Kappa Architecture is a data processing architecture that provides a simplified approach to handling both real-time and batch data processing. Unlike the Lambda Architecture, which requires separate paths for batch and real-time processing, the Kappa Architecture uses a single stream processing engine for both real-time and historical data processing.

Main elements of the Kappa Design Pattern include:

Single Data Pipeline: A unified data processing stream that handles both real-time and historical data.
Stream Processing Engine: A core component that processes incoming data in real-time.
Immutable Data Store: All data is stored in an immutable, append-only log, ensuring data integrity and enabling easy reprocessing.
Reprocessing Capability: The ability to reprocess historical data by replaying the log.

Acceptance Criteria:

Implement a single data pipeline that processes both real-time and historical data.
Integrate a stream processing engine to handle incoming data in real-time.
Ensure that the data store is immutable and supports reprocessing of historical data by replaying the log.
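The elements and acceptance criteria above can be sketched without any external system. The following is a minimal in-memory illustration, not a proposed implementation for the repository: `KappaSketch`, `Event`, `AppendOnlyLog`, and `TotalsProcessor` are hypothetical names, and a plain list stands in for a real log such as Kafka. It shows one processor class serving both the real-time path and the reprocessing path, with the derived view rebuilt by replaying the log.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Minimal in-memory sketch; every name in this file is hypothetical. */
final class KappaSketch {

  /** An immutable event: a key and an amount. */
  static final class Event {
    final String key;
    final long amount;

    Event(String key, long amount) {
      this.key = key;
      this.amount = amount;
    }
  }

  /** Append-only log: events can be added and read, never changed or removed. */
  static final class AppendOnlyLog {
    private final List<Event> entries = new ArrayList<>();

    /** Appends an event and returns its offset in the log. */
    synchronized long append(Event event) {
      entries.add(event);
      return entries.size() - 1;
    }

    /** Replays every event from the given offset through the handler. */
    synchronized void replay(int fromOffset, Consumer<Event> handler) {
      for (int i = fromOffset; i < entries.size(); i++) {
        handler.accept(entries.get(i));
      }
    }
  }

  /** Stream processor that maintains a per-key total as a derived view. */
  static final class TotalsProcessor implements Consumer<Event> {
    private final Map<String, Long> totals = new HashMap<>();

    @Override
    public void accept(Event event) {
      totals.merge(event.key, event.amount, Long::sum);
    }

    Map<String, Long> view() {
      return Collections.unmodifiableMap(totals);
    }
  }

  public static void main(String[] args) {
    AppendOnlyLog log = new AppendOnlyLog();
    TotalsProcessor live = new TotalsProcessor();

    // Real-time path: each event is appended to the log, then processed.
    for (Event e : List.of(new Event("a", 5), new Event("b", 7), new Event("a", 3))) {
      log.append(e);
      live.accept(e);
    }

    // Reprocessing path: a fresh processor rebuilds the view by replaying the log.
    TotalsProcessor rebuilt = new TotalsProcessor();
    log.replay(0, rebuilt);

    // Both paths run the same processor code, so the views match: prints true.
    System.out.println(live.view().equals(rebuilt.view()));
  }
}
```

Because the view is derived entirely from the log, changing the processing logic only requires deploying a new processor and replaying from offset 0. No separate batch code path is needed.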

ranjeet-floyd commented 5 years ago

@iluwatar Can I take this? Thanks.

iluwatar commented 5 years ago

Go ahead @ranjeet-floyd

iluwatar commented 4 years ago

The issue is free to take

Azureyjt commented 4 years ago

@iluwatar May I work on this issue?

iluwatar commented 4 years ago

Ok @Azureyjt

Azureyjt commented 4 years ago

Thanks, iluwatar. I'm afraid I need to study the background first, since I'm not very familiar with big data systems and stream processing, so it may take a little more time.

iluwatar commented 4 years ago

No problem. Thanks for looking into this.

Anurag870 commented 4 years ago

@iluwatar I would like to take it up. Please assign this to me.

iluwatar commented 4 years ago

Great @Anurag870, it's done

Anurag870 commented 4 years ago

@iluwatar Kappa Architecture requires log-based storage such as Kafka or Pulsar and a stream processing engine such as Spark or Flink. Having a basic running Java example would mean setting up these systems as well.

Even UML diagrams may not suit it very well.

I was wondering whether I could add a draw.io image and its XML for this, along with some description. If you feel it is out of our scope, we can close the issue as well.

iluwatar commented 4 years ago

The approach we used in Hexagonal Architecture was to have multiple implementations of the data store. There was an in-memory data store, and with some additional configuration you could switch to a Mongo database.
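Applied to this issue, that approach might look like the following sketch: the processing code depends only on a small log interface, and an in-memory implementation is the default so the example runs with no Kafka or Pulsar setup. All names here (`PluggableLogSketch`, `EventLog`, `InMemoryEventLog`, and the `kappa.log` system property) are assumptions made for illustration, not existing code in the repository. A Kafka-backed implementation is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a swappable log; none of these names exist in the repository. */
final class PluggableLogSketch {

  /** Port: the only log operations the processing code depends on. */
  interface EventLog {
    void append(String event);

    List<String> readFrom(int offset);
  }

  /** Default adapter: keeps the log in memory so the example needs no external system. */
  static final class InMemoryEventLog implements EventLog {
    private final List<String> entries = new ArrayList<>();

    @Override
    public synchronized void append(String event) {
      entries.add(event);
    }

    @Override
    public synchronized List<String> readFrom(int offset) {
      // Return an immutable copy so callers cannot alter the log.
      return List.copyOf(entries.subList(offset, entries.size()));
    }
  }

  /** Chooses an adapter by name; only "memory" is implemented in this sketch. */
  static EventLog create(String kind) {
    switch (kind) {
      case "memory":
        return new InMemoryEventLog();
      default:
        // A Kafka- or Pulsar-backed adapter would be registered here.
        throw new IllegalArgumentException("No log adapter configured for: " + kind);
    }
  }

  public static void main(String[] args) {
    // Assumed system property; falls back to the in-memory adapter.
    EventLog log = create(System.getProperty("kappa.log", "memory"));
    log.append("order-created");
    log.append("order-paid");
    System.out.println(log.readFrom(0)); // prints [order-created, order-paid]
  }
}
```

Keeping the interface this narrow means a later adapter for a real log only has to provide append and offset-based reads, which is the same idea as the in-memory and Mongo variants in Hexagonal Architecture.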

Anurag870 commented 4 years ago

Sure, I will look at that.

iluwatar commented 4 years ago

@Anurag870 still working on this?

iluwatar commented 2 years ago

The issue is unassigned again