To support a compute cluster on cloudmesh, we propose a new design for orchestrating big data clusters. We will investigate OpenStack Heat, Chef, Puppet, and Docker to see whether there are benefits in their design decisions, and suggest a new approach for starting and configuring clusters such as Hadoop or SLURM on cloudmesh (i.e. OpenStack). With our new design and implementation of a cluster manager, we will have a new command, cluster, to start, configure, manage, or update compute nodes (VMs) on cloudmesh. We identify the current issues so far:
- Starting virtual machines with a fixed number of nodes
  - Scalable clusters can only be configured manually.
- Communication across multiple nodes via ssh by hand
  - Updating the authorized_keys and hosts files is done manually via script (a typical manual round is sketched after this list).
- Configuration of master and worker nodes by hand
  - This is currently done with Chef cookbooks, but there is no automated script.
- Confirmation of proper working and configuration
  - A user needs to verify that each node is working correctly as expected.
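For illustration, here is a minimal sketch of the kind of manual round the current setup requires; the worker addresses, user name, and file locations are assumptions, not cloudmesh code:

#!/bin/bash
# Hypothetical manual round: push the master's public key and an updated
# hosts file to every worker by hand.
WORKERS="10.0.0.11 10.0.0.12 10.0.0.13 10.0.0.14"
MASTER_KEY=$(cat ~/.ssh/id_rsa.pub)
for ip in $WORKERS; do
  # append the master's key so the master can reach each worker
  echo "$MASTER_KEY" | ssh ubuntu@$ip 'cat >> ~/.ssh/authorized_keys'
  # copy a hand-maintained hosts file so the nodes can resolve each other
  scp /tmp/cluster-hosts ubuntu@$ip:/tmp/hosts
  ssh ubuntu@$ip 'sudo mv /tmp/hosts /etc/hosts'
done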
We offer the following new features in the new design and implementation:
- Easy start-up of VMs with OpenStack Heat
  - Pre-configured templates provide a simple way to launch VMs together with their initialization processes (a template sketch follows this list).
- ssh-copy-id with OpenStack Heat
  - ssh authentication can be easily established with OpenStack Heat SoftwareDeployment (available from the Icehouse release); a sketch of this also follows the list.
  - In cloudmesh, we provide two individual templates for the master and worker nodes so that their initialization steps can differ.
- Verification of the completed setup of a cluster
  - By defining expected results in cloudmesh, the cluster command knows whether the cluster is up and running properly. Otherwise, all relevant logs and messages are reported to the user.
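As a minimal sketch, a Heat Orchestration Template (HOT) that boots a group of worker VMs could look like the following; it is written to a file via a heredoc for illustration, and the image, flavor, and key names are assumptions rather than the actual cloudmesh templates:

cat > hadoop.yaml <<'EOF'
heat_template_version: 2013-05-23
parameters:
  num_workers:
    type: number
    default: 5
resources:
  workers:
    type: OS::Heat::ResourceGroup
    properties:
      count: { get_param: num_workers }
      resource_def:
        type: OS::Nova::Server
        properties:
          image: ubuntu-14.04      # assumed image name
          flavor: m1.small         # assumed flavor
          key_name: cloudmesh-key  # assumed keypair
EOF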
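Likewise, a rough sketch of how a SoftwareConfig/SoftwareDeployment pair could append the master's public key to a worker's authorized_keys; the resource references and attribute names here are illustrative assumptions:

cat > ssh-key-deploy.yaml <<'EOF'
resources:
  add_master_key:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      inputs:
        - name: master_pub_key
      config: |
        #!/bin/bash
        # runs on the worker; the user name is an assumption
        echo "$master_pub_key" >> /home/ubuntu/.ssh/authorized_keys
  deploy_master_key:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: { get_resource: add_master_key }
      server: { get_resource: worker }  # assumed worker resource
      input_values:
        master_pub_key: { get_attr: [master_keypair, public_key] }  # assumed
EOF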
Our tentative plans for the command cluster:
Example 1. start 5 VMs for a Hadoop cluster
cluster start hadoop --num=5

Here, hadoop is a template name that refers to the location of the OpenStack Heat template as well as the templates for the master and worker nodes.
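Under the hood, such a command could translate into a Heat stack creation like the following; the stack name and parameter name are assumptions:

heat stack-create hadoop-cluster -f hadoop.yaml -P num_workers=5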
Example 2. write a template for the master node of Hadoop
cluster write hadoop master
#!/bin/bash
# install the Chef client via the omnibus installer
curl -L https://www.opscode.com/chef/install.sh | bash
# fetch and unpack the chef-repo skeleton
wget http://github.com/opscode/chef-repo/tarball/master
tar -zxf master
mv opscode-chef-repo* /home/ubuntu/chef-repo
rm master
# point knife at the local cookbook path
mkdir /home/ubuntu/chef-repo/.chef
echo "cookbook_path [ '/home/ubuntu/chef-repo/cookbooks' ]" > /home/ubuntu/chef-repo/.chef/knife.rb
# download the cookbooks needed for a Hadoop master
knife cookbook site download java
knife cookbook site download apt
knife cookbook site download yum
knife cookbook site download hadoop
...
(Ctrl-D or EOF to finish writing)
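After such a template runs, the node still has to converge the cookbooks. A minimal sketch with chef-solo, assuming the downloaded cookbook tarballs have been extracted into the cookbook path (the file names and run list are assumptions):

# hypothetical follow-up step, not part of the template above
cat > /home/ubuntu/chef-repo/solo.rb <<'EOF'
cookbook_path ['/home/ubuntu/chef-repo/cookbooks']
EOF
echo '{ "run_list": [ "recipe[hadoop]" ] }' > /home/ubuntu/chef-repo/node.json
chef-solo -c /home/ubuntu/chef-repo/solo.rb -j /home/ubuntu/chef-repo/node.json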
Example 3. define the expected result after proper installation and configuration
cluster success hadoop "service hadoop-hdfs-namenode status" "* Hadoop namenode is running"
"service hadoop-hdfs-namenode status" is a command to verify.
"* Hadoop namenode is running" is an expected result.