hhm970 / mental-health-monitor-project


Plan out ETL process #7

Open hhm970 opened 5 months ago

hhm970 commented 5 months ago

Description

Not all user input into the database will come from tickboxes; some entries will be free text (e.g. the user's emotions). We will need to clean the free-text emotion inputs before loading them.
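A minimal sketch of what cleaning the emotions field could look like, using only the standard library. The function name `clean_emotions` and the separator set are assumptions for illustration; the real survey answers may need different rules.

```python
import re
import unicodedata

# Assumption: respondents separate several emotions with commas,
# semicolons, slashes, ampersands or the word "and".
_SEPARATORS = re.compile(r"[,;/&]|\band\b", flags=re.IGNORECASE)


def clean_emotions(raw: str) -> list[str]:
    """Split a free-text emotions answer into normalised, de-duplicated labels."""
    # Normalise Unicode so visually identical inputs compare equal.
    text = unicodedata.normalize("NFKC", raw or "")
    labels = []
    for part in _SEPARATORS.split(text):
        # Lower-case, collapse internal whitespace, strip stray punctuation.
        label = " ".join(part.lower().split()).strip(".!?'\"")
        # Keep first-seen order and skip empties and repeats.
        if label and label not in labels:
            labels.append(label)
    return labels
```

For example, `clean_emotions("  Happy, anxious ;  TIRED and happy. ")` gives `["happy", "anxious", "tired"]`.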

Possible obstacles include:

Required Files

./pipeline

User Story

As an engineer, I need to ensure that no values are duplicated in the database, so that the database remains in third normal form (3NF).

hhm970 commented 5 months ago

Extraction: We will use Google Forms for the daily survey. Responses will be stored in Google Sheets and extracted via the Google Sheets API.
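A rough sketch of the step right after extraction, assuming the rows are fetched with `spreadsheets.values.get` (for instance via `service.spreadsheets().values().get(spreadsheetId=..., range=...).execute()` in google-api-python-client). That call returns the sheet as a list of rows under `"values"`, with trailing empty cells omitted, so short rows need padding. The helper name `rows_to_records` and the column names are hypothetical.

```python
def rows_to_records(values: list[list[str]]) -> list[dict[str, str]]:
    """Turn a Sheets-style list of rows (header row first) into a list of dicts."""
    if not values:
        return []
    header, *rows = values
    records = []
    for row in rows:
        # The Sheets API drops trailing empty cells, so pad short rows.
        padded = row + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Hypothetical data shaped like the "values" field of the API response;
# the column names and rows are made up for illustration.
sample = [
    ["Timestamp", "Email", "Emotions"],
    ["2024-01-05 09:00:00", "a@example.com", "happy, tired"],
    ["2024-01-05 09:30:00", "b@example.com"],
]
records = rows_to_records(sample)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(records)` for the transform step.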

Transform: Read the extracted data into a pandas DataFrame and perform any necessary data cleaning.

Load: Insert each cleaned record into its corresponding table, checking carefully that every field goes to the right place.
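One way the load step could avoid repeated values is to keep each emotion in a lookup table and link responses to it. The sketch below uses an in-memory SQLite database purely as a stand-in; the table names, column names and the `load_response` helper are all assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: emotions live in one lookup table and are
# referenced by id, so no emotion name is stored twice.
SCHEMA = """
CREATE TABLE IF NOT EXISTS emotion (
    emotion_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS response (
    response_id INTEGER PRIMARY KEY,
    user_hash   TEXT NOT NULL,
    submitted   TEXT NOT NULL,
    UNIQUE (user_hash, submitted)
);
CREATE TABLE IF NOT EXISTS response_emotion (
    response_id INTEGER NOT NULL REFERENCES response(response_id),
    emotion_id  INTEGER NOT NULL REFERENCES emotion(emotion_id),
    PRIMARY KEY (response_id, emotion_id)
);
"""


def get_or_create_emotion(conn, name):
    """Return the id for an emotion name, inserting it only if it is new."""
    conn.execute("INSERT OR IGNORE INTO emotion (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT emotion_id FROM emotion WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def load_response(conn, user_hash, submitted, emotions):
    """Insert one survey response and its emotions; safe to re-run."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO response (user_hash, submitted) VALUES (?, ?)",
            (user_hash, submitted),
        )
        response_id = conn.execute(
            "SELECT response_id FROM response WHERE user_hash = ? AND submitted = ?",
            (user_hash, submitted),
        ).fetchone()[0]
        for name in emotions:
            emotion_id = get_or_create_emotion(conn, name)
            conn.execute(
                "INSERT OR IGNORE INTO response_emotion VALUES (?, ?)",
                (response_id, emotion_id),
            )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_response(conn, "abc123", "2024-01-05T09:00:00", ["happy", "tired"])
load_response(conn, "def456", "2024-01-05T09:30:00", ["tired"])
```

Because every insert is `INSERT OR IGNORE` against a uniqueness constraint, re-running the pipeline on the same sheet should not create duplicate rows.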

hhm970 commented 1 month ago

The transform step needs hashing functionality, because storing raw email addresses raises security and privacy issues.
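A possible sketch of the hashing step, using a keyed hash (HMAC-SHA256) from the standard library. A plain unsalted hash of an email address can be reversed by hashing lists of known addresses, so a secret key is assumed here; the function name `hash_email`, the key handling and the lower-casing of the address are all assumptions rather than a decided design.

```python
import hashlib
import hmac


def hash_email(email: str, secret_key: bytes) -> str:
    """Return a hex HMAC-SHA256 digest that stands in for the raw email."""
    # Assumption: addresses are treated as case-insensitive, so the same
    # user always maps to the same digest regardless of how they typed it.
    normalised = email.strip().lower()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same email and key always produce the same 64-character digest, so the digest can still be used to group a user's daily entries without storing the address itself. The key would need to be kept outside the repository, for example in an environment variable.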