kfrankc / thenextbestbook

📚Books Recommendation Website for DATA 515A - Spring 2019

Automate ETL process with dummy data #4

Closed: tharunsikhinam closed this issue 5 years ago

tharunsikhinam commented 5 years ago

1) Store a subset of the raw data in the repo.
2) Run the Amazon and Goodreads scripts to generate cleaned data and dump it to MongoDB.
3) Add test cases in Spark.
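
A rough sketch of how the steps above might fit together on dummy data. Everything here is an assumption for illustration: the column names, the `RAW_SAMPLE` contents, and the function names are hypothetical and are not taken from the repo's Amazon or Goodreads scripts. The load step takes a generic `sink` callable so the sketch runs without a database; with pymongo, a collection's `insert_many` could presumably be passed in instead.

```python
import csv
import io

# Hypothetical dummy sample standing in for the raw data subset kept in the repo.
RAW_SAMPLE = """title,author,rating
 Dune ,Frank Herbert,4.2
,Unknown Author,3.0
Emma,Jane Austen,not_a_number
"""


def extract(raw_text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Strip whitespace, drop rows without a title, and coerce rating to float."""
    cleaned = []
    for row in rows:
        title = (row.get("title") or "").strip()
        if not title:
            continue  # a record with no title is unusable, so skip it
        try:
            rating = float(row.get("rating") or "")
        except ValueError:
            rating = None  # keep the record but mark the rating as missing
        cleaned.append(
            {
                "title": title,
                "author": (row.get("author") or "").strip(),
                "rating": rating,
            }
        )
    return cleaned


def load(docs, sink):
    """Pass the cleaned documents to the sink and return how many were sent."""
    if docs:
        sink(docs)
    return len(docs)


def run_etl(raw_text, sink):
    """Run extract, transform, and load in sequence."""
    return load(transform(extract(raw_text)), sink)


if __name__ == "__main__":
    store = []  # in-memory stand-in for a database collection
    count = run_etl(RAW_SAMPLE, store.extend)
    print(count, store)
```

Keeping the sink injectable means the same `run_etl` call can be exercised in tests against a plain list and pointed at a real collection elsewhere.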

tharunsikhinam commented 5 years ago

Added the data folder, along with test data.