revaturelabs / biforce

Biforce is a project conducted by Revature to improve its business decisions by re-examining existing metrics and investigating new metrics that will increase the value of company assets. The goal is to leverage all relevant technologies to automate the data analysis process within the business intelligence life cycle across different departments of the company. The objective is to implement efficient data processing algorithms, using tools available within the Hadoop ecosystem, that will run on both a physical cluster and a cloud cluster.

Oozie #100

Open malyq opened 5 years ago

malyq commented 5 years ago

Automate the entire process from Caliber to Redshift, beginning with transferring the data to S3 and connecting to EMR.

luiginomp commented 5 years ago

Commit ef544b06b117534661ded19f09fdce197f917486

Oozie workflows have been reworked. The library now includes Setup and Execution sections. Setup is run once from the development environment to save the Sqoop jobs in the Sqoop metastore. The Execution workflows must be migrated into EMR.

The current iteration is still missing the biforce-execute-imports.xml workflow and the biforce-execute.properties file. The workflow should have sub-workflow actions that call hive-execute-import.xml and warehouse-execute-import.xml.
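
For reference, a minimal sketch of what the missing biforce-execute-imports.xml could look like, using Oozie's standard sub-workflow action. This is not taken from the repository: the action names, the `${nameNode}` and `${appRoot}` parameters (which would presumably be defined in biforce-execute.properties), the schema version, and the order of the two imports are all assumptions for illustration.

```xml
<!-- Hypothetical sketch of biforce-execute-imports.xml; names, paths and ordering are assumed -->
<workflow-app name="biforce-execute-imports" xmlns="uri:oozie:workflow:0.5">
    <start to="hive-import"/>

    <!-- Calls the Hive import workflow as a child workflow -->
    <action name="hive-import">
        <sub-workflow>
            <app-path>${nameNode}${appRoot}/hive-execute-import.xml</app-path>
            <!-- Passes the parent job configuration down to the child workflow -->
            <propagate-configuration/>
        </sub-workflow>
        <ok to="warehouse-import"/>
        <error to="fail"/>
    </action>

    <!-- Calls the warehouse import workflow once the Hive import succeeds -->
    <action name="warehouse-import">
        <sub-workflow>
            <app-path>${nameNode}${appRoot}/warehouse-execute-import.xml</app-path>
            <propagate-configuration/>
        </sub-workflow>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Import failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="end"/>
</workflow-app>
```

With `<propagate-configuration/>` in place, the values from the parent job's properties file are visible to both child workflows, so each of them does not need its own copy of the connection settings.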