airbytehq / write-for-the-community

Contribute and collaborate on educational content for the Airbyte Community.
MIT License

Building a real-time data analytics pipeline with Airbyte, Apache Kafka, and Apache Pinot #145

Closed: dunithd closed this issue 2 years ago

dunithd commented 2 years ago

Submission Details

Timely access to insights is crucial for data-driven decision making. Typically, it takes significant engineering effort to build and maintain a data pipeline that moves operational data into a real-time analytics platform like Apache Pinot. This tutorial explores how Airbyte makes it easier for a typical developer to build a streaming data pipeline that moves operational data from a MySQL database to Apache Pinot.

Steps

Airbyte related
1) Configure a source connector for MySQL
2) Configure a destination/sink connector for Kafka
3) Create a connection between MySQL and Kafka
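
Once the MySQL-to-Kafka connection syncs, each row should arrive in the Kafka topic wrapped in an Airbyte envelope. The sketch below shows how a consumer might unwrap one such message. It assumes the envelope uses the fields `_airbyte_ab_id`, `_airbyte_stream`, `_airbyte_emitted_at`, and `_airbyte_data`; the `orders` stream, its columns, and the `unwrap` helper are hypothetical and only for illustration.

```python
import json

# Assumed shape of one message value written by Airbyte's Kafka destination.
# The envelope field names are an assumption; the payload columns are hypothetical.
sample_message = json.dumps({
    "_airbyte_ab_id": "00000000-0000-0000-0000-000000000000",
    "_airbyte_stream": "orders",
    "_airbyte_emitted_at": 1650000000000,
    "_airbyte_data": {"id": 1, "status": "SHIPPED", "amount": 42.5},
})


def unwrap(raw: str) -> dict:
    """Return the source row plus the emit timestamp from one enveloped message."""
    envelope = json.loads(raw)
    row = dict(envelope["_airbyte_data"])  # copy so the envelope is not mutated
    row["emitted_at"] = envelope["_airbyte_emitted_at"]
    return row


if __name__ == "__main__":
    print(unwrap(sample_message))
```

Knowing this shape matters for the Pinot steps, because the table definition has to read its columns from inside the nested payload rather than from the top level of the message.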

Pinot related
1) Configure a table definition to ingest from the Kafka topic that receives data from Airbyte
2) Query the table using the integrated UI
3) Invoke Pinot APIs with curl to mimic an application making requests for analytics
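
The Pinot steps above can be sketched in Python as two small helpers: one that builds a real-time table definition pointing at the Kafka topic, and one that prepares the same HTTP request a curl call to the broker's SQL endpoint would send. This is a minimal sketch, not the tutorial's actual configuration: the table name, topic, time column, and URLs are hypothetical, and the stream config keys and plugin class names are assumptions that should be checked against the Pinot version in use.

```python
import json
from urllib import request


def realtime_table_config(table: str, topic: str, brokers: str, time_column: str) -> dict:
    """Build a REALTIME table definition that ingests from a Kafka topic."""
    return {
        "tableName": table,
        "tableType": "REALTIME",
        "segmentsConfig": {
            "timeColumnName": time_column,
            "schemaName": table,
            "replication": "1",
        },
        "tenants": {},
        "tableIndexConfig": {
            "loadMode": "MMAP",
            # Keys and class names below are assumed; verify them for your Pinot release.
            "streamConfigs": {
                "streamType": "kafka",
                "stream.kafka.topic.name": topic,
                "stream.kafka.broker.list": brokers,
                "stream.kafka.consumer.type": "lowlevel",
                "stream.kafka.consumer.factory.class.name":
                    "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
                "stream.kafka.decoder.class.name":
                    "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
            },
        },
        "metadata": {},
    }


def sql_query_request(broker_url: str, sql: str) -> request.Request:
    """Prepare (but do not send) a POST to the broker's /query/sql endpoint."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return request.Request(
        f"{broker_url}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical names and addresses, for illustration only.
    config = realtime_table_config("orders", "orders", "localhost:9092", "emitted_at")
    print(json.dumps(config, indent=2))
    req = sql_query_request("http://localhost:8099", "SELECT COUNT(*) FROM orders")
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `request.urlopen` would send it to a running broker. Building the payloads separately from sending them keeps the sketch usable without a live Pinot cluster.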

Inspiration
https://medium.com/event-driven-utopia/building-reference-architectures-for-user-facing-analytics-dc11c7c89df3
https://medium.com/event-driven-utopia/building-a-low-latency-fitness-leaderboard-with-apache-pinot-40a4da672cf0

The Format

arimbr commented 2 years ago

Draft: https://docs.google.com/document/d/1QB6BcQGvhpLLIABX7BSSJ_vcbBKfziXV1N_ySM4LNVs/edit?usp=sharing