DATAVIEW is a big data workflow management system. It uses Dropbox as the data cloud and Amazon EC2 as the compute cloud. Current research focuses on the security and privacy aspects of DATAVIEW as well as performance and cost optimization for running workflows in clouds.
Paper Submission FGCS: Deadline-Constrained Big Data Workflow Scheduling in the Cloud: the LPOD Algorithm #15
This paper extends our earlier work with the following additional contributions:
All parameters of the LPOD algorithm, such as the average execution time of a task and the average transfer time of a data product, were previously configured manually. This configuration is now automated: provenance information is collected and used to estimate these parameters.
-- Create a workflow configuration file with the workflow name. (If the input data for a specific workflow changes, we assume the workflow has changed.)
We describe the architecture, algorithms, and implementation of the workflow executor Beta, which takes a workflow schedule produced by LPOD and executes it in Amazon EC2 with the support of its associated task executor.
Additional experiments have been conducted to demonstrate the performance of the LPOD algorithm, the workflow executor Beta, and its associated task executor.
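To make the provenance-based estimation concrete, here is a minimal sketch in Python of how average execution and transfer times might be derived from collected provenance records. The record layout `(kind, name, seconds)`, the function name `estimate_parameters`, and the sample values are all assumptions made for illustration; they are not taken from the DATAVIEW code base or from the paper.

```python
from collections import defaultdict
from statistics import mean


def estimate_parameters(records):
    """Average the observed durations per task and per data product.

    `records` is a hypothetical provenance log: an iterable of
    (kind, name, seconds) tuples, where kind is "task" or "transfer".
    """
    samples = defaultdict(list)
    for kind, name, seconds in records:
        samples[(kind, name)].append(seconds)
    # One estimate per (kind, name) pair: the mean of all observations.
    return {key: mean(values) for key, values in samples.items()}


# Illustrative provenance data (made up for this sketch).
records = [
    ("task", "T1", 10.0),
    ("task", "T1", 14.0),
    ("transfer", "D1", 2.0),
    ("transfer", "D1", 4.0),
]

estimates = estimate_parameters(records)
print(estimates[("task", "T1")])      # average execution time of task T1
print(estimates[("transfer", "D1")])  # average transfer time of data product D1
```

A plain arithmetic mean is used here only because the text speaks of "average" times. The estimator actually used for LPOD may differ.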