desaiyang / DevOps

Some details about DevOps and its associated technologies ... browse through ...

Data Pipeline #26

Open desaiyang opened 1 year ago

desaiyang commented 1 year ago

https://www.snowflake.com/guides/data-pipeline

desaiyang commented 1 year ago

A data pipeline is a means of moving data from one place (the source) to a destination (such as a data warehouse). Along the way, data is transformed and optimized, arriving in a state that can be analyzed and used to develop business insights.

In essence, a data pipeline is the set of steps involved in aggregating, organizing, and moving data. Modern data pipelines automate many of the manual steps involved in transforming and optimizing continuous data loads. Typically, this includes loading raw data into a staging table for interim storage and then transforming it before ultimately inserting it into the destination reporting tables.
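
The staging-then-reporting flow described above can be sketched roughly as follows. This is only an illustration using Python's built-in `sqlite3` module in place of a real warehouse; the table names (`staging_orders`, `report_customer_totals`), the columns, and the `run_pipeline` function are assumptions made up for the example, not anything taken from the Snowflake guide.

```python
import sqlite3


def run_pipeline(raw_rows):
    """Load raw rows into a staging table, transform them, and fill a reporting table.

    All table and column names here are hypothetical.
    """
    conn = sqlite3.connect(":memory:")  # in-memory database stands in for the warehouse

    # Staging table: holds raw data as-is (amount is still text at this point).
    conn.execute("CREATE TABLE staging_orders (customer TEXT, amount TEXT)")
    # Reporting table: cleaned, aggregated data ready for analysis.
    conn.execute(
        "CREATE TABLE report_customer_totals (customer TEXT PRIMARY KEY, total REAL)"
    )

    # Step 1: load the raw data into staging for interim storage.
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?)", raw_rows)

    # Step 2: transform (normalize names, cast amounts, drop blanks) and
    # insert the result into the reporting table.
    conn.execute(
        """
        INSERT INTO report_customer_totals (customer, total)
        SELECT LOWER(TRIM(customer)), SUM(CAST(amount AS REAL))
        FROM staging_orders
        WHERE amount IS NOT NULL AND TRIM(amount) <> ''
        GROUP BY LOWER(TRIM(customer))
        """
    )

    # Step 3: clear staging so the next load starts from an empty table.
    conn.execute("DELETE FROM staging_orders")
    conn.commit()

    rows = conn.execute(
        "SELECT customer, total FROM report_customer_totals ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample = [(" Alice ", "10.5"), ("alice", "4.5"), ("Bob", "3"), ("Bob", "")]
    print(run_pipeline(sample))
```

In a real pipeline the three steps would usually be scheduled and run continuously rather than called once, but the shape (load raw, transform, insert into reporting) stays the same.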

desaiyang commented 1 year ago

ELEMENTS

Data pipelines consist of three essential elements: a source or sources, processing steps, and a destination.

  1. Sources: Sources are where data comes from. Common sources include relational database management systems like MySQL, CRMs such as Salesforce and HubSpot, ERPs like SAP and Oracle, social media management tools, and even IoT device sensors.

  2. Processing steps: In general, data is extracted from sources, manipulated and changed according to business needs, and then deposited at its destination. Common processing steps include transformation, augmentation, filtering, grouping, and aggregation.

  3. Destination: A destination is where the data arrives at the end of its processing, typically a data lake or data warehouse for analysis.
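
As a rough illustration of how the three elements fit together, here is a minimal Python sketch. Everything in it is hypothetical: the sensor records, the function names, and the plain dictionary used as a "destination" are stand-ins chosen for the example, not part of any real system or of the Snowflake guide.

```python
from collections import defaultdict


def read_source():
    """Source: hypothetical IoT sensor readings in Celsius (hard-coded for the sketch)."""
    return [
        {"device": "sensor-a", "reading_c": 20.0},
        {"device": "sensor-a", "reading_c": 25.0},
        {"device": "sensor-b", "reading_c": None},  # missing reading
        {"device": "sensor-b", "reading_c": 30.0},
    ]


def filter_valid(records):
    """Processing step (filtering): drop records with no reading."""
    return (r for r in records if r["reading_c"] is not None)


def transform(records):
    """Processing step (transformation): add a Fahrenheit value to each record."""
    return ({**r, "reading_f": r["reading_c"] * 9 / 5 + 32} for r in records)


def group_and_aggregate(records):
    """Processing steps (grouping, aggregation): average Fahrenheit reading per device."""
    groups = defaultdict(list)
    for r in records:
        groups[r["device"]].append(r["reading_f"])
    return {device: sum(vals) / len(vals) for device, vals in groups.items()}


def run_pipeline(destination):
    """Move data from the source, through the processing steps, into the destination."""
    processed = group_and_aggregate(transform(filter_valid(read_source())))
    destination.update(processed)  # a dict stands in for a data lake or warehouse
    return destination


if __name__ == "__main__":
    print(run_pipeline({}))
```

Each processing step is a small function that takes records in and passes records on, so steps can be added, removed, or reordered without touching the source or the destination.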

desaiyang commented 1 year ago

image

desaiyang commented 1 year ago

DEVOPS FOR DATA APPS ON SNOWFLAKE

devops-for-data-apps-on-snowflake.pdf