ChakshuGautam / cQube-ingestion

cQube Ingestion Blocks
MIT License

[POC] CDC using Estuary.dev #159

Open tushar5526 opened 1 year ago

tushar5526 commented 1 year ago

Problem statement

Currently, end users have to manually generate CSV files for the different events in a program and ingest them into cQube. There is scope for automation here: if we capture change events in the database, we can generate the program and event files directly using a config file.

Solution Architecture

[Image: solution architecture diagram (Screenshot 2023-08-06 at 11 26 56 AM)]

We are using Estuary as the CDC provider here. A data pipeline is created that sends every change event on the database to a webhook. The webhook accepts the data, parses it, generates the related CSV files, and stores them in an S3-like storage (for the POC, we store them on the local system only). A cron job ingests the data into cQube at a configured interval.
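As a rough illustration of the webhook's "parse and generate CSV" step, here is a minimal Python sketch. It is not the code from the POC repository. The event shape (`{"table": ..., "row": {...}}`), the function name `append_event_to_csv`, and the one-file-per-table layout are all assumptions made for this example; the real payload sent by Estuary and the real file layout may differ.

```python
import csv
from pathlib import Path


def append_event_to_csv(event: dict, out_dir: Path) -> Path:
    """Append one change event's row to <out_dir>/<table>.csv.

    Hypothetical event shape: {"table": str, "row": dict}.
    Assumes every row of a given table has the same set of columns.
    """
    table = event["table"]
    row = event["row"]

    # Local directory stands in for the S3-like storage in this sketch.
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{table}.csv"
    is_new_file = not csv_path.exists()

    with csv_path.open("a", newline="") as fh:
        # Sort column names so the column order is stable across events.
        writer = csv.DictWriter(fh, fieldnames=sorted(row))
        if is_new_file:
            writer.writeheader()  # header only once per file
        writer.writerow(row)

    return csv_path
```

A webhook handler would call this function once per received event, and the cron job would later pick up the files in `out_dir` for ingestion.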

The ingest files generated are structured as follows:

This POC is backward compatible as well: we can add it to existing cQube deployments with ease.

tushar5526 commented 1 year ago

CDC service - https://github.com/tushar5526/cqube-cdc-poc
CDC branch - https://github.com/tushar5526/cQube-POCs/tree/cdc-demo