unbit / uwsgi

uWSGI application server container
http://projects.unbit.it/uwsgi

[RFC] Legion + Emperor for balanced distribution of vassals #314

Open unbit opened 11 years ago

unbit commented 11 years ago

The scenario is N servers, each one running an Emperor.

We have 100 vassals, and we want them to be distributed evenly among the Emperors.

So if we have 2 Emperors, each one will automatically get 50 vassals.

The infrastructure could be:

0) /etc/vassals is an NFS share
1) elect a lord
2) only the lord scans for vassal files in /etc/vassals
3) when a new instance has to be spawned, the lord talks to the Emperor with the fewest running vassals and forces it to spawn the new one
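
Step 3 amounts to least-loaded placement. A minimal sketch in Python, assuming the lord keeps an in-memory count of running vassals per Emperor (the function names `pick_emperor` and `distribute` and the node names are hypothetical, not part of uWSGI):

```python
def pick_emperor(running):
    """Return the Emperor with the fewest running vassals.

    `running` maps an Emperor name to its current vassal count.
    Ties are broken by name so the result is deterministic.
    """
    return min(running, key=lambda name: (running[name], name))


def distribute(vassals, emperors):
    """Assign each vassal file to the least-loaded Emperor, one at a time."""
    running = {name: 0 for name in emperors}
    placement = {}
    for vassal in vassals:
        target = pick_emperor(running)
        placement[vassal] = target
        running[target] += 1
    return placement, running


# Hypothetical vassal file names under the shared directory.
vassals = [f"/etc/vassals/vassal{i:03d}.ini" for i in range(100)]
placement, running = distribute(vassals, ["node1", "node2"])
print(running)
```

With 100 vassals and 2 Emperors this ends with 50 vassals on each node, which is the balance described above.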

Missing pieces:

What happens if the lord dies? All of the nodes should have a copy of the status.

Investigate whether we can avoid modifying the Emperor and rely only on an imperial monitor plugin.

Another topic: we add another vassal file if we want to scale the same app to multiple servers. We need to ensure that instances of the same app do not run on the same hardware.

unbit commented 11 years ago

UPDATE 1: The whole subsystem should support broodlord mode (read: ask for more instances).

As broodlord mode currently only supports spawning zerg instances, we should extend it to force an external Emperor to spawn a new vassal.

unbit commented 11 years ago

CURRENT PROPOSED DESIGN (part1):

The whole structure will live in the emperor_legion plugin (changing the core should not be needed).

The emperor is configured as:

uwsgi --legion foobar ... --emperor legion:foobar,/etc/uwsgi

You run this command on all of the nodes of your cluster.

The lord (and only the lord) will start monitoring /etc/uwsgi (we assume /etc/uwsgi is an NFS share).

When a new vassal has to be started, the lord sends a message to all of the cluster members to learn their "availability" for the specific vassal. Something like:

-> node1, are you able to take /etc/uwsgi/vassal001.ini ?
<- node1 [yes, i am already running 17 vassals]
-> node2, are you able to take /etc/uwsgi/vassal001.ini ?
<- node2 [no, i am already running 3 vassals]
-> node3, are you able to take /etc/uwsgi/vassal001.ini ?
<- node3 [yes, i am already running 5 vassals]
...

node3 is able to take /etc/uwsgi/vassal001.ini and, among the nodes that answered yes, it has the lowest number of running vassals.
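
The selection above could be sketched as follows. This is only an illustration in Python: the `Reply` type and `choose_node` function are hypothetical names, and the real plugin would exchange these answers over the Legion subsystem rather than build a list by hand.

```python
from typing import NamedTuple, Optional


class Reply(NamedTuple):
    node: str
    able: bool      # whether the node accepts the vassal
    running: int    # how many vassals the node already runs


def choose_node(replies) -> Optional[str]:
    """Pick the willing node with the fewest running vassals, or None."""
    candidates = [r for r in replies if r.able]
    if not candidates:
        return None
    # Break ties by node name so the choice is deterministic.
    return min(candidates, key=lambda r: (r.running, r.node)).node


# The answers from the exchange above.
replies = [
    Reply("node1", True, 17),
    Reply("node2", False, 3),
    Reply("node3", True, 5),
]
print(choose_node(replies))  # node3
```

node2 runs fewer vassals than node3, but it is skipped because it declared itself unavailable.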

Finally, the lord sends something like:

-> node3, i order you to spawn /etc/uwsgi/vassal001.ini
<- node3, [ok]

Problem: what happens if node3 fails to start the vassal? Should we try the next available cluster node?
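
If the answer turns out to be "yes", one possible shape for the fallback is to walk the available nodes from least to most loaded until one confirms. A rough Python sketch, where `spawn_with_fallback` and the `send_spawn_order` callable are hypothetical stand-ins for the real message exchange:

```python
def spawn_with_fallback(vassal, candidates, send_spawn_order):
    """Try candidates from least to most loaded until one confirms.

    `candidates` is a list of (node, running_vassals) pairs for the nodes
    that declared themselves available. `send_spawn_order(node, vassal)`
    is a hypothetical callable returning True when the node answers [ok].
    Returns the node that spawned the vassal, or None if all of them failed.
    """
    for node, _running in sorted(candidates, key=lambda c: (c[1], c[0])):
        if send_spawn_order(node, vassal):
            return node
    return None


# Simulated cluster in which node3 fails to start the vassal.
def fake_order(node, vassal):
    return node != "node3"


chosen = spawn_with_fallback(
    "/etc/uwsgi/vassal001.ini",
    [("node1", 17), ("node3", 5)],
    fake_order,
)
print(chosen)  # node1
```

This sketch does not address how long the lord should wait for a confirmation, or what to do when every node fails.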

BROODLORD:

The node running /etc/uwsgi/vassal001.ini receives an SOS for more resources from the vassal.

This message exchange will run:

-> lord, i need more resources for /etc/uwsgi/vassal001.ini
(in the background the message exchange described before happens again; the nodes already running /etc/uwsgi/vassal001.ini will announce their inability to run it)
<- lord [ok]
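
On the node side, the rule that a node already running the vassal declares itself unavailable could look roughly like this. It is a Python sketch under the assumption that each node tracks the set of vassal files it runs; `availability_reply` is a hypothetical name.

```python
def availability_reply(node_vassals, vassal):
    """Build a node's answer to the lord's availability question.

    `node_vassals` is the set of vassal files this node already runs.
    A node already running `vassal` reports itself unable to take it.
    Returns (able, number_of_running_vassals).
    """
    able = vassal not in node_vassals
    return able, len(node_vassals)


# Hypothetical cluster state: node3 already runs vassal001.
cluster = {
    "node1": {"/etc/uwsgi/vassal002.ini"},
    "node3": {"/etc/uwsgi/vassal001.ini"},
}
wanted = "/etc/uwsgi/vassal001.ini"
answers = {node: availability_reply(v, wanted) for node, v in cluster.items()}
print(answers)
```

Here node3 answers "no" because it already runs the vassal, so the additional instance can only land on node1.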

prymitive commented 11 years ago

I'm still trying to push my upaas - "PaaS for the poor" - project forward. It's a set of tools for managing apps running on uWSGI:

each application is described by yaml metadata that contains at least:

deployment looks like this:

Now we have the application packed into a tar.gz containing everything we need to run it somewhere. The system will take this package, unpack it on the selected backend(s) and generate a uWSGI Emperor vassal file for it. The vassal will start, subscribe to the FastRouter, and we are done.

I've been stuck for the past few weeks, but I'm getting back to developing it, so hopefully I will make it usable in a few months.

This architecture is currently handling ~70 apps in the company I work for and devs are happy with it.