angelofgdsantos / authors-project

Code from research assistant work for Professor Chinhui Juhn at University of Houston Economics Department

Move Authors Project to Cloud #11

Open jordanholbrook opened 9 months ago

jordanholbrook commented 9 months ago

https://azure.microsoft.com/en-us/free/students

https://www.databricks.com/university

https://github.com/great-expectations/great_expectations

https://aws.amazon.com/about-aws/whats-new/2015/05/aws-educate-students-and-educators-can-access-aws-technology-cloud-courses-training-and-collaboration-tools/

https://towardsdatascience.com/the-easiest-way-to-run-python-in-google-cloud-illustrated-d307c9e1651c

jordanholbrook commented 9 months ago

Three Tiers of Data Processing:

Ingest raw files --> Bronze
Basic data validations (remove missing values, etc.) --> Silver
Additional processing on the validated data, add new variables --> Gold
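A rough sketch of how the three tiers could chain together, in plain Python with no cloud dependencies. The column names (`author`, `title`, `pub_year`), the derived `decade` variable, and the function names are all hypothetical placeholders, not the project's actual fields or code.

```python
from typing import Dict, List, Optional

# One record from a raw file; values may be missing (None or empty string).
Row = Dict[str, Optional[str]]


def to_bronze(raw_rows: List[Row]) -> List[Row]:
    """Bronze: keep every ingested record as-is (copied, not modified)."""
    return [dict(row) for row in raw_rows]


def to_silver(bronze: List[Row], required: List[str]) -> List[Row]:
    """Silver: basic validation, dropping rows with a missing required field."""
    return [
        row
        for row in bronze
        if all((row.get(col) or "").strip() for col in required)
    ]


def to_gold(silver: List[Row]) -> List[dict]:
    """Gold: add new variables on top of validated rows.

    The 'decade' variable is only an illustration of a derived column.
    """
    gold = []
    for row in silver:
        out = dict(row)
        out["decade"] = (int(row["pub_year"]) // 10) * 10
        gold.append(out)
    return gold
```

Each tier only reads from the tier before it, so a failed validation or a changed variable definition can be rerun without ingesting the raw files again.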

Need to create a SQL database:

Authors table
Books table
Books genre table
Authors Demographics table

Once the data is normalized into the database, merges become trivial.
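A minimal sketch of what the four tables might look like, using Python's built-in `sqlite3` with an in-memory database as a stand-in for whatever cloud SQL service ends up being used. The column names, keys, and the `build_db` / `books_with_demographics` helpers are assumptions for illustration only; the real schema still needs to be decided.

```python
import sqlite3

# Hypothetical schema: one row per author, per book, and per (book, genre) pair.
SCHEMA = """
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE author_demographics (
    author_id  INTEGER PRIMARY KEY REFERENCES authors(author_id),
    gender     TEXT,
    birth_year INTEGER
);
CREATE TABLE books (
    book_id   INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(author_id),
    title     TEXT NOT NULL,
    pub_year  INTEGER
);
CREATE TABLE book_genres (
    book_id INTEGER NOT NULL REFERENCES books(book_id),
    genre   TEXT NOT NULL,
    PRIMARY KEY (book_id, genre)
);
"""


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with the four tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def books_with_demographics(conn: sqlite3.Connection) -> list:
    """The 'merge' becomes a single query joining on the shared keys."""
    query = """
        SELECT b.title, a.name, d.birth_year, g.genre
        FROM books AS b
        JOIN authors AS a ON a.author_id = b.author_id
        LEFT JOIN author_demographics AS d ON d.author_id = a.author_id
        LEFT JOIN book_genres AS g ON g.book_id = b.book_id
        ORDER BY b.title, g.genre
    """
    return conn.execute(query).fetchall()
```

The left joins keep books whose author has no demographics row or whose genre is unknown, which is usually what we want when checking coverage.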