pombreda / appscale

Automatically exported from code.google.com/p/appscale

Allow ZooKeeper to run and store critical system information #163

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, whichever machine runs the "shadow" role is a single point of
failure in the system. To begin remedying this, Yoshi has proposed
integrating support for ZooKeeper and storing critical system metadata in it.
Specifically, the following changes need to be made:

We need to write a server that exposes a REST / SOAP interface to the
ZooKeeper instance running on the same machine. It will expose a minimal
number of methods that relate to the system's metadata. Currently, we
believe the following methods are sufficient:

- getAllBoxes(): Returns a list of the machines that should be up and their
roles.
- getLiveBoxes(): Returns a list of the machines that are actually up and
their roles (these two methods can tell us which machines are dead or
unresponsive)
- getApps(): Returns a list of the applications that should be running
- getUserData(user): Returns the data associated with user 'user'.
- getAppData(app): Returns the data associated with application 'app'. This
contains data on where the app should be running as well.
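
As a rough illustration of the method list above, here is a minimal in-memory sketch in Python. It is not the proposed server: the REST / SOAP layer and the ZooKeeper backend are left out, the snake_case method names mirror the ones listed, and the `MetadataStore` class, its fields, and the `get_dead_boxes()` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """In-memory stand-in for the ZooKeeper-backed metadata (hypothetical)."""
    expected_boxes: dict = field(default_factory=dict)  # ip -> list of roles
    live_ips: set = field(default_factory=set)          # ips known to be up
    apps: dict = field(default_factory=dict)            # app name -> app data
    users: dict = field(default_factory=dict)           # user name -> user data

    def get_all_boxes(self):
        """Machines that should be up, with their roles."""
        return dict(self.expected_boxes)

    def get_live_boxes(self):
        """Machines that are actually up, with their roles."""
        return {ip: roles for ip, roles in self.expected_boxes.items()
                if ip in self.live_ips}

    def get_dead_boxes(self):
        """Hypothetical helper: expected machines that are not live."""
        return {ip: roles for ip, roles in self.expected_boxes.items()
                if ip not in self.live_ips}

    def get_apps(self):
        """Names of the applications that should be running."""
        return sorted(self.apps)

    def get_user_data(self, user):
        """Data for 'user', or None if the user is unknown (an assumption)."""
        return self.users.get(user)

    def get_app_data(self, app):
        """Data for 'app', including where it should run, or None."""
        return self.apps.get(app)


store = MetadataStore(
    expected_boxes={"192.168.1.2": ["shadow"], "192.168.1.3": ["appengine"]},
    live_ips={"192.168.1.2"},
    apps={"guestbook": {"hosts": ["192.168.1.3"]}},
)
print(store.get_dead_boxes())
```

Comparing the two box listings is what identifies dead or unresponsive machines; in the example, only the box missing from `live_ips` is reported.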

Need to discuss if other functions are needed.

Blocked on (137) right now, since we don't want to run ZooKeeper everywhere,
only on critical nodes or on nodes outside of the AppScale cloud. Thus we wish
to specify in our config file where to run it and only run it there.

Need to discuss how this would work and what it entails for Jonathan and
Yoshi. It will also require a scratch install script that automates only the
ZooKeeper installation, since it will be run on boxes outside of AppScale. We
also need to take into consideration that ZooKeeper must run for HBase, and
how to ensure the two don't conflict.

Original issue reported on code.google.com by shattere...@gmail.com on 29 Jan 2010 at 7:33

GoogleCodeExporter commented 9 years ago
Need the following:

1) zookeeper_scratch_install.sh - Installs ZooKeeper and the AppController from
scratch on the given box.

2) Change AppController's djinn.rb to add a helper method start_zookeeper() that
runs ZooKeeper. This method will use the list of boxes running ZooKeeper to write
the ZK config and start ZK on this box if its role is ZK. This functionality must
be able to run as a non-root user.

3) Add a ZooKeeperHelper that provides an interface to ZK. Preferred methods are
get(key), put(key, val), and delete(key). Need to decide what the semantics are
of these calls (e.g., what does get return when the key doesn't exist?)

4) Change appscale-add-keypair to read YAML keys as user@ip instead of just IP.
If user is not specified, assume it is root. When ssh calls are made here, change
them to use user@ip instead of root@ip.

5) Change appscale-run-instances to scp over needed files as user@ip instead of
just IP.

6) Change appscale-terminate-instances to scp over needed files as user@ip
instead of just IP.
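
To make the open question in item 3 concrete, here is one possible set of semantics for get/put/delete, sketched in Python against a plain dict rather than ZooKeeper. The choices shown (get returns None for a missing key, put rejects None so that result stays unambiguous, delete reports whether the key existed) are assumptions offered for discussion, not decisions.

```python
class ZooKeeperHelper:
    """In-memory stand-in; a real helper would call ZooKeeper, not a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        """Return the stored value, or None when the key does not exist."""
        return self._data.get(key)

    def put(self, key, val):
        """Create the key or overwrite its value.

        None is rejected so that a None result from get() always means
        "no such key" (an assumed convention).
        """
        if val is None:
            raise ValueError("val must not be None")
        self._data[key] = val

    def delete(self, key):
        """Remove the key; return True if it existed, False otherwise."""
        if key in self._data:
            del self._data[key]
            return True
        return False


zk = ZooKeeperHelper()
zk.put("/appscale/boxes", "192.168.1.2,192.168.1.3")  # hypothetical key
print(zk.get("/appscale/boxes"))
print(zk.get("/appscale/missing"))
```

The alternative is to raise an error on a missing key, which is closer to how ZooKeeper itself reports a nonexistent node; returning None keeps callers simpler but hides the difference between "missing" and "unset".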
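
The user@ip handling in items 4 to 6 can be sketched as follows. This is Python for illustration only; the function names `parse_host` and `ssh_target` are hypothetical and not part of the AppScale tools.

```python
def parse_host(entry, default_user="root"):
    """Split 'user@ip' into (user, ip); a bare 'ip' gets the default user."""
    user, _, host = entry.strip().rpartition("@")
    return (user or default_user), host


def ssh_target(entry):
    """Build the user@ip string to pass to ssh or scp."""
    user, host = parse_host(entry)
    return f"{user}@{host}"


print(ssh_target("appscale@192.168.1.2"))
print(ssh_target("192.168.1.3"))
```

Routing every ssh and scp call through one helper like this would keep the root default in a single place across the three tools.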

Recommendations for incremental progress are as follows:

- Pull the latest code from trunk, which has support for advanced layouts. Note
that in the simple layout, every node should be running ZK (will need to change
tools/lib/node_layout.rb accordingly). Be sure to commit to a separate branch
until code review is done.
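
For reference, the config-writing step that start_zookeeper() needs could look roughly like the sketch below. It is written in Python for illustration (the real helper would live in djinn.rb), and the layout format, the "zookeeper" role name, the data directory, and the function names are all assumptions. The zoo.cfg keys and the server.N=host:2888:3888 lines follow the standard ZooKeeper replicated-mode configuration.

```python
def zk_nodes(layout):
    """Return the IPs whose roles include 'zookeeper', in a stable order."""
    return sorted(ip for ip, roles in layout.items() if "zookeeper" in roles)


def render_zoo_cfg(zk_ips, data_dir="/var/lib/zookeeper", client_port=2181):
    """Render zoo.cfg contents for the given ZooKeeper ensemble."""
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        f"dataDir={data_dir}",
        f"clientPort={client_port}",
    ]
    # Every node must list the servers with the same ids, hence the sort above.
    for server_id, ip in enumerate(zk_ips, start=1):
        lines.append(f"server.{server_id}={ip}:2888:3888")
    return "\n".join(lines) + "\n"


def my_id(zk_ips, my_ip):
    """Server id to write to dataDir/myid, or None if this box has no ZK role."""
    return zk_ips.index(my_ip) + 1 if my_ip in zk_ips else None


# Simple layout: every node carries the ZK role (hypothetical layout format).
layout = {
    "192.168.1.3": ["appengine", "zookeeper"],
    "192.168.1.2": ["shadow", "zookeeper"],
}
ips = zk_nodes(layout)
print(render_zoo_cfg(ips))
print(my_id(ips, "192.168.1.3"))
```

A box whose `my_id` result is None would skip starting ZK, which matches the "start ZK on this box if its role is ZK" requirement.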

Comments welcome.

Original comment by shattere...@gmail.com on 4 Feb 2010 at 10:13

GoogleCodeExporter commented 9 years ago
Need to have this working for the next release, since transaction support may
rely on it.

Original comment by shattere...@gmail.com on 1 Apr 2010 at 8:34

GoogleCodeExporter commented 9 years ago

Original comment by shattere...@gmail.com on 22 Sep 2010 at 4:16