Closed: isaacabraham closed this issue 9 years ago.
Good question. Every job that is executed is indeed single-threaded; however, there is no guarantee that a worker won't be running other jobs simultaneously. Ideally, a worker should be executing one job at a time, and that job should be maxing out all cores of the worker. In practice, this may not happen: users may be tempted to pipe 1000 workflows to `Cloud.Parallel`. The addition of Tasks to the programming model also brings in the potential of using lightweight jobs, such as creating actors.
So I think allowing collocated jobs is useful in general. Unfortunately, the Azure runtime currently uses an interim solution, in which the max number of concurrent jobs can only be set by the configuration of the worker role. We need to improve on this. One way to do it is to have instances dequeue jobs dynamically, based on current CPU/network load. Perhaps we could use machine learning techniques to somehow detect that `cloud { return 1 + 1 }` is a trivial job that can be dequeued in scores. Suggestions welcome.
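To make the granularity problem concrete, here is a minimal sketch of the scenario described above: many trivial workflows piped straight into `Cloud.Parallel`, each becoming its own scheduled job. The `MBrace.Core` namespace is an assumption here; only `cloud { ... }` and `Cloud.Parallel` from the thread itself are used.

```fsharp
open MBrace.Core

// 1000 trivial workflows; the work in each is dwarfed by the cost of
// scheduling and dequeueing it as a distinct cloud job.
let tinyJobs : seq<Cloud<int>> =
    Seq.init 1000 (fun i -> cloud { return i + 1 })

// Piping them all into Cloud.Parallel creates one job per workflow.
// A load-aware scheduler could batch such trivial jobs instead.
let naive : Cloud<int []> = Cloud.Parallel tinyJobs
```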
@eiriktsarpalis Currently in Brisk I'm setting the max jobs per worker to the number of cores, but given our discussion yesterday re: performance I am considering increasing this to e.g. cores * 8 for the moment. Either way - it's probably misleading to suggest that it's single threaded - indeed, in performance tests that example and the (more complicated) parallel threaded one actually came out at roughly the same time.
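For illustration only, a sketch of the sizing heuristic under discussion; the actual Brisk/MBrace.Azure configuration property is not named in this thread, so the values below are hypothetical stand-ins, not a real API.

```fsharp
// Hypothetical values only: the real worker-role setting is configured
// elsewhere and is not shown in this thread.
let coreCount = System.Environment.ProcessorCount
let currentMaxJobs  = coreCount      // current Brisk setting: one job per core
let proposedMaxJobs = coreCount * 8  // multiplier under consideration
```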
True, but I think we should actively discourage users from passing arbitrarily many workflows into `Cloud.Parallel`. I tend to believe that increasing the number of concurrent jobs per worker will merely reinforce the illusion that it makes no difference whether we group inputs together or schedule distinct jobs for each and every one of them (see the sketch below).
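A hedged sketch of the grouping alternative, again assuming the `MBrace.Core` namespace: `mapChunked` is a hypothetical helper (not part of MBrace) that partitions the inputs into a fixed number of chunks, so each cloud job processes a whole chunk sequentially rather than one job being scheduled per element.

```fsharp
open MBrace.Core

/// Hypothetical helper: partition the inputs into jobCount chunks and
/// schedule one cloud job per chunk; each job maps its chunk sequentially.
let mapChunked (jobCount : int) (f : 'T -> 'S) (inputs : 'T []) : Cloud<'S []> =
    cloud {
        let! chunkResults =
            inputs
            |> Array.splitInto jobCount
            |> Array.map (fun chunk -> cloud { return Array.map f chunk })
            |> Cloud.Parallel
        return Array.concat chunkResults
    }
```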
BTW, the latest version includes an `MBrace.Workflows` namespace that contains a number of useful utilities. Among them is the `Distributed.map : ('T -> Cloud<'S>) -> seq<'T> -> Cloud<'S []>` combinator, which works as a drop-in replacement for the `Cloud.Parallel` primitive. This will balance inputs across workers/cores and should offer improved performance regardless of size.
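A usage sketch comparing the two approaches, based on the signature quoted above; the `MBrace.Core` and `MBrace.Workflows` namespaces are assumed.

```fsharp
open MBrace.Core
open MBrace.Workflows

// One job per element: fine for a few coarse workflows, wasteful for
// large inputs.
let viaParallel (inputs : int []) : Cloud<int []> =
    inputs
    |> Array.map (fun x -> cloud { return x * x })
    |> Cloud.Parallel

// Drop-in replacement: Distributed.map balances the same inputs
// across workers and cores before scheduling.
let viaDistributed (inputs : int []) : Cloud<int []> =
    inputs |> Distributed.map (fun x -> cloud { return x * x })
```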
Should we amend the existing demos then, if going forwards that's the idiomatic way to parallelise simple `seq<'T>` workloads?
I already changed a few with my last PR. I left the 'hello world' sample because I found it instructive (I think it returns the job id, so every output is different).
Is it worth leaving Cloud.Parallel in the API then? I'm a fan of having a single way to do a thing - this is really important for getting new people on board quickly as well.
I think that, for educational reasons, it is important for Cloud.Parallel to appear in the introductory fsx files. Of course it is our job (as runtime implementors) to optimise the Cloud.Parallel primitive.
I think so, yes. It is the essential primitive on which everything else is built. In the end, an MBrace user wanting to fully optimize their particular algorithm may end up using the primitive directly. This may complicate the programming model, but then again nobody said distributed computation was easy.
@palladin I beg to differ. I think that the power of the primitives lies in their unambiguous treatment of granularity. If a runtime implementer decided to overrule this, it would overturn assumptions made by library designers. Not being able to reason about the underlying flow of execution is bound to take a toll on performance gains.
I'm not referring to the granularity of the Cloud.Parallel primitive, but to this: https://github.com/mbraceproject/MBrace.Azure/issues/19, i.e. optimising the unit cost of each cloud block.
Ok, agreed.
Closing this, as we've moved these examples out of the mainline teaching sequence.
@eiriktsarpalis @palladin - please close (I'm not an admin here).
In exercise 3, there's an example, `clusterMultiWorkerSingleThreaded`. Is this really single threaded? How do we guarantee that only one task / job will be carried out at a time on any worker?
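For context, a hypothetical reconstruction of the pattern such an example presumably uses; the actual exercise code is not quoted in this thread, so the body below is an assumption. As discussed above, each job runs single-threaded, but nothing in the primitive pins jobs one-per-worker.

```fsharp
open MBrace.Core

// Hypothetical shape of the demo: one sequential workflow per input.
// Each resulting job is single-threaded, but a worker may still
// dequeue and run several of these jobs concurrently.
let clusterMultiWorkerSingleThreaded (inputs : int []) : Cloud<int []> =
    inputs
    |> Array.map (fun x -> cloud { return x * 2 })
    |> Cloud.Parallel
```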