WSU-CPTS415-ParquetParkour / Amazon-CoPurchasing

GNU General Public License v3.0

[TASK] ETL performance monitoring #6

Jarrus00 closed 2 years ago

Jarrus00 commented 2 years ago

Background: The SNAP dataset is approximately 1 GB, so we need to be mindful of the time it takes to load and process the dataset when importing it into the database.

Problem: We need a way to track ETL performance so we can identify inefficiencies early and avoid wasting CPU cycles or human time.
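
One lightweight way to approach this is sketched below. It is only an illustration, not the implementation in this repository: the names `timed_stage`, `report`, and the stage labels are hypothetical. It wraps each ETL stage in a context manager that records wall-clock time (`time.perf_counter`) and CPU time (`time.process_time`), so slow stages stand out.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_stage(name, timings):
    """Append wall-clock and CPU time for the enclosed block to `timings`.

    `name` and `timings` are hypothetical; any list-like collector works.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        # Record the measurement even if the stage raises.
        timings.append({
            "stage": name,
            "wall_s": time.perf_counter() - wall_start,
            "cpu_s": time.process_time() - cpu_start,
        })


def report(timings):
    """Return one formatted line per stage, slowest (by wall time) first."""
    ordered = sorted(timings, key=lambda r: r["wall_s"], reverse=True)
    return [
        f'{r["stage"]}: {r["wall_s"]:.3f}s wall, {r["cpu_s"]:.3f}s CPU'
        for r in ordered
    ]


if __name__ == "__main__":
    timings = []
    with timed_stage("parse", timings):
        # Stand-in for parsing the raw dataset file.
        rows = [line.split(",") for line in ["a,b", "c,d"]]
    with timed_stage("load", timings):
        # Stand-in for writing rows to the database.
        time.sleep(0.01)
    print("\n".join(report(timings)))
```

A large gap between wall time and CPU time for a stage suggests it is waiting on I/O (disk or database) rather than computing, which helps decide where to optimize first.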

Success Criteria:

Jarrus00 commented 2 years ago

An MVP implementation has been created in the dev_jarrus branch.