Building ETLs is repetitive and data-source specific. We therefore want to provide an off-the-shelf set of ETLs for as many of the data sources listed in Landscape4Data as possible. This will give users the ability to automate both data harvesting and updates of existing data.
Example:

    sudo -u mapred hadoop fs -mkdir example
    sudo -u mapred hadoop fs -mkdir example/lib
    sudo -u mapred hadoop fs -put [file_to_upload] example
All Java programs must be compiled into jars, and it is recommended that all of their dependencies be packaged into the jar as well. Copy all jars to example/lib.
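For instance, uploading a jar into the lib directory might look like the following; the jar name etl-job.jar is a hypothetical example, not a file mentioned in this document:

```shell
# Upload the application jar (assumed name: etl-job.jar) into the
# lib subdirectory of the HDFS workflow folder, running as mapred.
sudo -u mapred hadoop fs -put etl-job.jar example/lib
```
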
Copy the Hive configuration file hive-site.xml from /etc/hive/conf/hive-site.xml to the example directory in HDFS.
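Using the source path and target directory given above, the copy can be done with the same fs -put pattern as before:

```shell
# Copy the Hive client configuration into the HDFS workflow directory
# so the workflow's Hive actions can find it.
sudo -u mapred hadoop fs -put /etc/hive/conf/hive-site.xml example
```
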
To run the Oozie workflow, execute the following command as the mapred user:
sudo -u mapred oozie job -oozie [oozie-server_url]/oozie -config [path to job.properties in local file system] -run
Example:

    sudo -u mapred oozie job -oozie http://udltest2.cs.ucl.ac.uk:11000/oozie -config [path to job.properties in local file system] -run
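The job.properties file referenced above tells Oozie where the cluster services and the workflow application live. A minimal sketch is shown below; the host names, ports, and HDFS path are assumed examples, not values taken from this document:

```properties
# job.properties -- all host names, ports, and paths below are assumed examples
nameNode=hdfs://namenode.example.com:8020
jobTracker=jobtracker.example.com:8032
queueName=default
oozie.use.system.libpath=true
# HDFS path of the workflow directory created above (contains workflow.xml and lib/)
oozie.wf.application.path=${nameNode}/user/mapred/example
```
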