Opened by ehiggs 10 years ago
The MPI service should be "callable" from the outside somehow, to coordinate stopping some or all of the Hadoop services. The same mechanism can then be reused to restart said services.
Forced stopping can already be done via `force_fn = os.path.join(self.controldir, 'force_stop')`; a similar "primitive" control can be used to stop-and-wait and to restart without reconfiguring.
Currently, we don't allow users to set Hadoop configuration options on startup; that is covered by issue #1. Because we don't allow that, users should at least be able to log in to the nodes, take down the cluster, change settings, and start it back up. This fails for two reasons:
1. The `$HADOOP_CONF_DIR/slaves` file is missing. It is merely a list of hostnames of the nodes running slave tasks, so it should probably exist.
2. The way `stop-mapred.sh` and `start-mapred.sh` work is by sshing into each of the slaves and taking down the tasks (JobTracker, TaskTracker). However, when we ssh into each node, we lose our environment, so the job loses track of where the hadoop scripts are (`$HADOOP_HOME` isn't found). Either we need to find a way to set up the environment so these scripts work, or we should provide our own scripts which do the same thing and let users bounce tasks on their own. A rough sketch of both fixes follows below.
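As a starting point, here is a hedged sketch of the two pieces: writing the `slaves` file from the node list we already know, and invoking the stop/start scripts with `HADOOP_HOME` and `HADOOP_CONF_DIR` set explicitly for the local invocation. Helper names and paths are assumptions; this does not by itself fix the remote side of the ssh problem (one option there is to set the variables in `$HADOOP_CONF_DIR/hadoop-env.sh`, which the Hadoop scripts source on each node).

```python
# Hypothetical sketch, not existing hod code.
import os
import subprocess

def write_slaves_file(conf_dir, slave_hostnames):
    """Write the plain list of slave hostnames that the stop/start scripts expect."""
    with open(os.path.join(conf_dir, 'slaves'), 'w') as fh:
        fh.write('\n'.join(slave_hostnames) + '\n')

def run_mapred_script(hadoop_home, conf_dir, script_name):
    """Run e.g. stop-mapred.sh or start-mapred.sh with the environment it needs locally."""
    env = dict(os.environ)
    env['HADOOP_HOME'] = hadoop_home
    env['HADOOP_CONF_DIR'] = conf_dir
    script = os.path.join(hadoop_home, 'bin', script_name)
    subprocess.check_call([script], env=env)

# write_slaves_file('/tmp/hod/conf', ['node101', 'node102'])
# run_mapred_script('/apps/hadoop-1.2.1', '/tmp/hod/conf', 'stop-mapred.sh')
```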