amazon-archives / aws-flow-ruby

Choosing Worker From Same Activity Type Nodes Based on System Usage #81

Closed pihish closed 9 years ago

pihish commented 9 years ago

I have four servers classified under the same activity type. All four continuously poll SWF. When I start one workflow, one of the nodes starts a processing routine that takes an hour and uses 80% of that server's CPU.

How do I make sure that the next workflow I start does not utilize this same server? And so on for the third and fourth workflows I start? Is there any logic I can put in my decider to do this?

pmohan6 commented 9 years ago

Thanks for your question! Just to clarify, does each workflow (decider) start one activity or is one workflow responsible for starting all 4 activities one after another?

pihish commented 9 years ago

There is one decider node, which is responsible for starting all activity nodes.

pmohan6 commented 9 years ago

You can start all 4 activity workers on different task lists. Then, in your decider code, you can specify which task list you want to schedule each activity on. This lets you control which task goes to which host. You can even queue multiple tasks on each task list and restrict the number of tasks the activity worker can pick up at once.
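A minimal sketch of the routing idea, assuming one task list per host. The task list names and the `task_list_for` helper are hypothetical and not part of aws-flow-ruby. The commented-out call only shows roughly where the chosen name would be handed over in the decider; its exact syntax is an assumption and should be checked against the framework docs.

```ruby
# Hypothetical task list names, one per activity host.
HOST_TASK_LISTS = %w[host1_tasks host2_tasks host3_tasks host4_tasks].freeze

# Pick a task list for the n-th activity (0-based) in round-robin order.
def task_list_for(index, task_lists = HOST_TASK_LISTS)
  raise ArgumentError, "no task lists configured" if task_lists.empty?
  task_lists[index % task_lists.size]
end

# In the decider, the chosen name would be passed as a scheduling option,
# for example (assumed syntax, not verified here):
#   client.process(input) { { task_list: task_list_for(i) } }
4.times { |i| puts "activity #{i} -> #{task_list_for(i)}" }
```

Each activity worker would then be started against exactly one of these task lists, so a task scheduled on `host2_tasks` can only be picked up by the second host.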

pihish commented 9 years ago

I think it's much better to monitor the queue levels of each activity type than to have logic in the decider try to ascertain which node is free. Since SWF operates on a pull system, you can put logic inside each activity worker so that it only pulls when it has unused resources. It's much simpler if you don't overload your decider with too much logic.
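As a rough illustration of "only pull when there are unused resources", the sketch below gates polling on the load average per core. Everything here is an assumption for illustration: the helper names, the 0.5 threshold, and the `poll_once` callable, which stands in for whatever call makes the worker fetch a single task. Reading `/proc/loadavg` works on Linux only.

```ruby
require "etc"

# Read the 1-minute load average (Linux only; returns nil elsewhere).
def current_load(path = "/proc/loadavg")
  File.exist?(path) ? File.read(path).split.first.to_f : nil
end

# True when load per core is below the threshold, i.e. there is headroom.
def capacity_available?(load, cores = Etc.nprocessors, threshold = 0.5)
  return true if load.nil? # unknown load: do not block polling
  (load / cores) < threshold
end

# Hypothetical poll loop: poll when there is headroom, otherwise back off.
def poll_when_idle(poll_once, iterations:, sleep_seconds: 5)
  iterations.times do
    if capacity_available?(current_load)
      poll_once.call
    else
      sleep sleep_seconds
    end
  end
end
```

A busy host simply stops asking SWF for work, so pending tasks stay queued until some worker with spare capacity polls for them.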

pmohan6 commented 9 years ago

When an ActivityWorker gets an ActivityTask from SWF, it spawns a new process to work on it. You can control whether the worker forks a new process and how many forks it may use. The ActivityWorker will not poll SWF once it reaches the maximum number of processes allowed; it waits for at least 1 slot to free up.
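The slot behaviour described above can be pictured with a small counter. This is only an illustration of the idea, not the framework's implementation; `ForkSlots` and its methods are hypothetical names.

```ruby
# Illustrative only: tracks how many tasks are in flight against a maximum.
class ForkSlots
  def initialize(max)
    @max = max
    @busy = 0
  end

  # A worker would only poll while a slot is free.
  def free?
    @busy < @max
  end

  # Take a slot when a task is picked up.
  def acquire
    raise "no free slot" unless free?
    @busy += 1
  end

  # Give the slot back when the task finishes.
  def release
    @busy -= 1 if @busy > 0
  end
end

slots = ForkSlots.new(1)
slots.acquire      # task picked up, the single slot is taken
puts slots.free?   # false: the worker would stop polling here
slots.release      # task finished
puts slots.free?   # true: polling can resume
```

With a maximum of 1, a host that is running the hour-long routine never asks for a second task, which is the effect the original question was after.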

If you are directly initializing the ActivityWorker, you can set the option as follows:

AWS::Flow::ActivityWorker.new(client, domain, tasklist, klass) { { use_forking: false } }

If you are using the Runner, you can set the value in your JSON config:

...
  "activity_workers": [
    {
      "number_of_workers": 1,
      "number_of_forks_per_worker": 1
    }
  ]
...

Hope that helps!