Tran-Ngoc-Bao / Process_Shopee_Data

Analyze e-commerce information on Shopee

Mini project - Data Engineering - Viettel Digital Talent 2024

Project introduction

Data flow

Deploy system

1. Pull and build the images listed in docker-compose.yaml first

docker pull { ... }

2. Change into the cloned project directory and start the system

docker compose up -d

3. Build the environment on airflow-webserver and airflow-scheduler

docker exec -u root -it [airflow-webserver/airflow-scheduler] bash 
source /opt/airflow/trino/build-env.sh
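
Because step 3 has to be repeated for both containers, it can be scripted. The following is a minimal sketch, not part of the project: it assumes the containers are literally named airflow-webserver and airflow-scheduler, and that build-env.sh only installs dependencies, so it does not need to be sourced in an interactive shell. The `build_env` function and the `DRY_RUN` variable are hypothetical helpers introduced here.

```shell
#!/bin/sh
# Sketch: run the environment build script in both Airflow containers.
# Assumption: container names match the placeholders used in step 3.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

build_env() {
  container="$1"
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command instead of running it
    echo "docker exec -u root $container bash -c 'source /opt/airflow/trino/build-env.sh'"
  else
    docker exec -u root "$container" bash -c 'source /opt/airflow/trino/build-env.sh'
  fi
}

for c in airflow-webserver airflow-scheduler; do
  build_env "$c"
done
```

If the script exports variables that later steps rely on, the interactive form shown in step 3 remains the safer choice.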

4. Once the system is running, the web UI ports of all containers are listed here

5. Start the DAG in the Airflow cluster
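
The DAG can be started from the Airflow web UI. As an alternative, the sketch below uses the Airflow 2 CLI (`airflow dags unpause` and `airflow dags trigger`) inside the scheduler container. It rests on several assumptions: the project runs Airflow 2.x, the container is named airflow-scheduler, and `<dag_id>` is a placeholder for the real DAG id, which is not stated here. `airflow_dags` and `DRY_RUN` are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: unpause and trigger a DAG through the Airflow CLI.
# Assumptions: Airflow 2.x, container name airflow-scheduler, DAG id supplied by the user.
# DRY_RUN defaults to 1, so commands are only printed; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

airflow_dags() {
  action="$1"   # unpause or trigger
  dag_id="$2"   # replace <dag_id> with the id shown by `airflow dags list`
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker exec airflow-scheduler airflow dags $action $dag_id"
  else
    docker exec airflow-scheduler airflow dags "$action" "$dag_id"
  fi
}

airflow_dags unpause "<dag_id>"
airflow_dags trigger "<dag_id>"
```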

6. Build the Superset environment

./superset/bootstrap-superset.sh

7. Visualize the data in Superset using this SQLAlchemy URI

trino://hive@trino:8080/iceberg
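
The URI follows the Trino SQLAlchemy form `trino://<user>@<host>:<port>/<catalog>`. The short sketch below splits it into its parts with plain shell parameter expansion, which helps when adapting the URI to another deployment. Reading the host `trino` as the Docker Compose service name is an assumption; 8080 is Trino's default HTTP port.

```shell
#!/bin/sh
# Sketch: break the Superset connection URI into its components.
uri="trino://hive@trino:8080/iceberg"

scheme="${uri%%://*}"        # trino   (SQLAlchemy dialect)
rest="${uri#*://}"           # hive@trino:8080/iceberg
user="${rest%%@*}"           # hive
hostport="${rest#*@}"        # trino:8080/iceberg
catalog="${hostport#*/}"     # iceberg (Trino catalog)
hostport="${hostport%%/*}"   # trino:8080
host="${hostport%%:*}"       # trino   (assumed to be the Compose service name)
port="${hostport#*:}"        # 8080

echo "dialect=$scheme user=$user host=$host port=$port catalog=$catalog"
# prints: dialect=trino user=hive host=trino port=8080 catalog=iceberg
```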

Output

Top well-rated products by item

Top liked products by item

Top products with the most comments

Report