p-society / gc-server

Stay updated in real-time and engage with the thrill of the game like never before.[WIP]
Apache License 2.0

[DISCUSSION] : Consideration of Using Kafka for Streaming Data to Clients #4

Closed zakhaev26 closed 4 months ago

zakhaev26 commented 7 months ago

The goal of the GCSB project is to develop a robust system with multiple independent and decoupled APIs for sports. Currently, Server-Sent Events (SSE) have been identified as a suitable choice for achieving real-time communication from server to client due to their lightweight nature and ease of setup, but there are certain concerns that need to be addressed.
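For context on why SSE is lightweight: an SSE stream is just a long-lived HTTP response with a simple text framing (an optional `event:` line, one or more `data:` lines, and a blank line terminating each frame). A minimal sketch of that framing in Go, with illustrative event and payload names, not the project's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// formatSSE renders one Server-Sent Event frame: an optional "event:" line
// naming the event type, one "data:" line per payload line, and a blank
// line that terminates the frame.
func formatSSE(event, data string) string {
	var b strings.Builder
	if event != "" {
		fmt.Fprintf(&b, "event: %s\n", event)
	}
	// Multi-line payloads become multiple data: lines per the SSE format.
	for _, line := range strings.Split(data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n")
	return b.String()
}

func main() {
	fmt.Print(formatSSE("score-update", `{"sport":"football","score":"2-1"}`))
}
```

A browser client would consume this with `new EventSource(url)` and filter on the event name; no extra protocol or library is needed on either side.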

Key Concerns:

  1. API Design Uniformity: How should we design the APIs to ensure uniformity across the project? Should we opt for a single SSE endpoint or an individual SSE endpoint for each API? We are aiming for a uniform approach that can enhance consistency and ease maintenance.

    • Single vs. Individual SSE: Trade-offs between having a single SSE for all APIs or individual SSE for each API.
    • Design Principles: We need to brainstorm on design principles to be followed for API endpoints, naming conventions, and response formats.
  2. Need for Kafka/Similar Queue or Pub-Sub Messaging:

    Question: Considering a maximum of 1000 concurrent users in the worst case, do we really need a queuing or pub-sub architecture like Kafka/RabbitMQ? What are the pros and cons? Can SSE alone handle the expected load, or is a more scalable solution necessary?

Please share your thoughts, concerns, and suggestions regarding the API design uniformity and the need for a queuing/pub-sub architecture in this issue thread.

Consider this as a High Priority Issue.

majorbruteforce commented 7 months ago

Kafka and other pub-sub services facilitate high-throughput reads and writes between 'systems'. They don't inherently help the client-serving server deal with a large number of requests (they are probably not made for that purpose). Keeping the project dynamics in mind, we in fact don't have a large throughput to deal with; rather, we need to efficiently serve data to a large volume of clients concurrently, reliably, and in real time. In my opinion, we should look into load balancing the servers and scaling them when required, along with using Redis to cache the data. We can build a system and test it using something like Apache JMeter. We should also test first to determine the degree of resilience our system requires, so we don't overengineer it.

punitkr03 commented 7 months ago

@majorbruteforce I was having the same thought. We can work with caching since we do not need high data throughput. We will keep the cache in sync with the database at set intervals, thus reducing complexity.
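The interval-sync idea could look like this in outline: a cache snapshot refreshed from the database on a ticker, so client reads never touch the DB directly. This is a sketch with hypothetical names; the `loader` function stands in for a real database query:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ScoreCache keeps a snapshot of scores and refreshes it from a loader
// (standing in for a database query) at fixed intervals.
type ScoreCache struct {
	mu     sync.RWMutex
	data   map[string]string
	loader func() map[string]string
}

func NewScoreCache(loader func() map[string]string) *ScoreCache {
	c := &ScoreCache{loader: loader}
	c.Refresh()
	return c
}

// Refresh replaces the snapshot with a fresh load; called on each tick.
func (c *ScoreCache) Refresh() {
	fresh := c.loader()
	c.mu.Lock()
	c.data = fresh
	c.mu.Unlock()
}

// Start refreshes the cache every interval until stop is closed.
func (c *ScoreCache) Start(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			c.Refresh()
		case <-stop:
			return
		}
	}
}

// Get reads from the snapshot without touching the database.
func (c *ScoreCache) Get(k string) string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.data[k]
}

func main() {
	db := map[string]string{"football": "0-0"} // stand-in for the real DB
	cache := NewScoreCache(func() map[string]string {
		out := make(map[string]string, len(db))
		for k, v := range db {
			out[k] = v
		}
		return out
	})
	db["football"] = "1-0" // the DB changes...
	cache.Refresh()        // ...and the next tick picks it up
	fmt.Println(cache.Get("football"))
}
```

The catch, raised later in this thread, is that the refresh interval caps freshness: data is at most one interval stale, which matters for live scores.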

zakhaev26 commented 7 months ago

We have two choices as of now that cater to this need:

M1. Using database change streams to track changes and emit them via the SSE server.
M2. Having a centralized Kafka cluster with multiple Kafka servers: producers (admins) publish messages to the partitions, and consumers pick them up, process them, and emit them via SSE.

Pros of M1: simple to set up.
Cons of M1: scalability.

Pros of M2: reliable, fault tolerant, scalable; can help in building a unified system.
Cons of M2: steep learning curve, maintenance overheads.

E.g., a two-sport pub-sub system: [diagram image]
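To make M2's shape concrete, the pipeline (admin producers → per-sport topic → consumers that fan out to SSE clients) can be modeled in-process with channels. This is a stdlib model of the data flow only, not Kafka client code; in the real system the topic would be a Kafka topic and each subscriber an SSE connection:

```go
package main

import (
	"fmt"
	"sync"
)

// Broker models the M2 flow in-process: admins publish score updates to a
// per-sport topic, and every subscriber of that topic (standing in for one
// SSE connection) receives each message.
type Broker struct {
	mu     sync.Mutex
	topics map[string][]chan string
}

func NewBroker() *Broker {
	return &Broker{topics: make(map[string][]chan string)}
}

// Subscribe registers a consumer on a topic and returns its channel.
func (b *Broker) Subscribe(topic string) <-chan string {
	ch := make(chan string, 16) // buffered so Publish doesn't block here
	b.mu.Lock()
	b.topics[topic] = append(b.topics[topic], ch)
	b.mu.Unlock()
	return ch
}

// Publish fans a message out to every subscriber of the topic.
func (b *Broker) Publish(topic, msg string) {
	b.mu.Lock()
	subs := b.topics[topic]
	b.mu.Unlock()
	for _, ch := range subs {
		ch <- msg
	}
}

func main() {
	b := NewBroker()
	football := b.Subscribe("football")
	chess := b.Subscribe("chess")

	// An admin publishing updates plays the Kafka producer role.
	b.Publish("football", `{"score":"2-1"}`)
	b.Publish("chess", `{"move":"Nf3"}`)

	fmt.Println(<-football)
	fmt.Println(<-chess)
}
```

What Kafka adds over this toy model is exactly M2's pros: the topic survives process crashes (durability), consumers can replay from an offset, and partitions let consumption scale horizontally.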

zakhaev26 commented 7 months ago

I believe ~1k connections can be managed by both methods. In my opinion, we should prioritize building the system with MongoDB change streams for now, as scalability is not really our need. We might not want to overengineer and complicate things, although if there is a requirement, please tell me. @majorbruteforce @punitkr03 Your thoughts?

punitkr03 commented 7 months ago

Seems complicated. Let me do some research.

punitkr03 commented 7 months ago

@majorbruteforce @zakhaev26 I did some digging, and in my opinion using Kafka will ensure stability in the long run if the project scales up, even if it is redundant for the current load. Also, we have plenty of time to implement it. I am all in on the Kafka implementation.

zakhaev26 commented 7 months ago

I am also interested in using Kafka.. @majorbruteforce @Brijendra-Singh2003 ?

zakhaev26 commented 7 months ago

Did some benchmarks to test the reliability of change streams vs. Kafka. These runs were performed on:

Test Scenario:

Outcome :

Functional Benchmarks:

Apache Kafka avg response time: ~364 ms, with all iterations succeeding. [screenshot]

Mongo change streams avg response time:
Trial 1: ~741 ms (freezes and fails after 8 iterations). [screenshot]
Trial 2: ~693 ms (freezes and fails after 16 iterations). [screenshot]

Performance Benchmarks:

10 VUs for 1 min fixed load: [screenshot]

100 VUs for 1 min fixed load:

Does this mean we shouldn't choose change streams over Kafka? Nope. These are admin updates; I don't think there would be 100 admins, or even 2 admins, uploading scores to the server in a single session. In that case, both are doable.

But after these tests, Kafka looks good.

@majorbruteforce @punitkr03 @Brijendra-Singh2003

I definitely didn't feel like primeagen after performing these tests :p

Source: https://github.com/zakhaev26/microservices-go

zakhaev26 commented 7 months ago

P.S.: I did try out the most popular Kafka library for JS, but it was slower to interact with Kafka IMO, whereas the confluentinc Kafka library for Go felt much faster, even on a single thread.

So even if we are planning to use Kafka, we need to make sure it works fine and is manageable with Node so that devs can work in JS as well. Performance metrics alone don't tell the full story; practical integration and developer experience matter.

majorbruteforce commented 7 months ago

I am also interested in using Kafka.. @majorbruteforce @Brijendra-Singh2003 ?

I am looking to create a load balanced system that caches the data using Redis. I will try to test how many SSE connections a server with standard specifications can handle.

zakhaev26 commented 7 months ago

What's the progress Jesse? @majorbruteforce 🕺

majorbruteforce commented 7 months ago

While building a two-layer system with a cache layer, I realized it makes no sense to use a cache for an application that has to update data constantly. Revalidating the cache that frequently is no better than broadcasting changes directly from change streams. I am trying to test a few more ways, like polling, to see how they compare. I will run some benchmarks and start working on building the main APIs soon.

majorbruteforce commented 7 months ago

Also, @zakhaev26 try running the benchmarks for read operations once. I will do the same. The system is going to have bulk reads rather than writes.

zakhaev26 commented 7 months ago

The system is going to have bulk reads rather than writes.

Genuine

punitkr03 commented 7 months ago

A basic implementation of the chess API architecture: [diagram image]

zakhaev26 commented 7 months ago

I used the Sarama library instead of the Confluent one for interacting with Kafka. I felt it is a more reliable way to use a producer + consumer, and it is extremely fast. Look into it if writing in Go: Sarama

majorbruteforce commented 4 months ago

Kafka has been tried and is the best option to proceed with for event streaming, as tested by @zakhaev26. Discussions regarding its implementation will continue on #38 from here on.