elastacloud / mbrace-on-brisk-starter

Contains a set of scripts and demos to get you up and running with MBrace on Brisk.

Is this really single threaded? #29

Closed isaacabraham closed 9 years ago

isaacabraham commented 9 years ago

In exercise 3, there's an example clusterMultiWorkerSingleThreaded. Is this really single threaded? How do we guarantee that only one task / job will be carried out at a time on any worker?

eiriktsarpalis commented 9 years ago

Good question. Every job that is executed is indeed single-threaded; however, there is no guarantee that a worker won't be running other jobs simultaneously. Ideally, a worker should execute one job at a time, and that job should max out all of the worker's cores. In practice this may not happen: users may be tempted to pipe 1000 workflows into Cloud.Parallel. The addition of Tasks to the programming model also brings the potential for lightweight jobs, such as creating actors.

So I think allowing collocated jobs is useful in general. Unfortunately, the Azure runtime currently uses an interim solution in which the maximum number of concurrent jobs can only be set through the worker role's configuration. We need to improve on this. One way would be to have instances dequeue jobs dynamically, based on current CPU/network load. Perhaps we could use machine learning techniques to somehow detect that cloud { return 1 + 1 } is a trivial job that can be dequeued in scores. Suggestions welcome.
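As a sketch of the usage pattern being discouraged here (this is not from the thread; the `cluster` handle and its `Run` method are assumptions about the MBrace client API, and the code needs a live cluster to actually execute):

```fsharp
// Hypothetical sketch: piping many trivial workflows into Cloud.Parallel
// schedules one distinct queued job per workflow, even though each job
// finishes almost instantly. `cluster` is an assumed MBrace client handle.
let tinyJobs =
    [ for i in 1 .. 1000 -> cloud { return i + i } ]

let results : int [] =
    tinyJobs
    |> Cloud.Parallel   // 1000 fine-grained jobs hit the work queue
    |> cluster.Run      // blocks until the cluster completes them all
```

This is exactly the granularity problem described above: the per-job scheduling and dequeue overhead dominates the trivial work inside each `cloud { ... }` block.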

isaacabraham commented 9 years ago

@eiriktsarpalis Currently in Brisk I'm setting the max jobs per worker to the number of cores, but given our discussion yesterday re: performance, I'm considering increasing this to e.g. cores * 8 for the moment. Either way, it's probably misleading to suggest that the example is single-threaded; indeed, in performance tests that example and the (more complicated) parallel threaded one came out at roughly the same time.

eiriktsarpalis commented 9 years ago

True, but I think we should actively discourage users from passing arbitrarily many workflows into Cloud.Parallel. I tend to believe that increasing the number of concurrent jobs per worker will merely amplify the illusion that it makes no difference whether we group inputs together or schedule a distinct job for each and every one of them.

By the way, the latest version includes an MBrace.Workflows namespace that contains a number of useful utilities. Among them is the Distributed.map : ('T -> Cloud<'S>) -> seq<'T> -> Cloud<'S []> combinator, which works as a drop-in replacement for the Cloud.Parallel primitive. It balances inputs across workers/cores and should offer improved performance regardless of input size.
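Based on the signature quoted above, usage would look something like the following sketch (not from the thread; the `cluster` handle and its `Run` method are assumed client-side names, and a live MBrace cluster is required to run it):

```fsharp
// Hypothetical sketch of Distributed.map as a drop-in for Cloud.Parallel.
// Per the signature above, it takes ('T -> Cloud<'S>) and a seq<'T>, and
// handles chunking inputs across workers/cores internally, so the caller
// no longer creates one queued job per element.
let squared : int [] =
    [ 1 .. 10000 ]
    |> Distributed.map (fun x -> cloud { return x * x })
    |> cluster.Run   // assumed client API for executing a Cloud<'T>
```

The design point is that the batching decision moves out of user code and into the combinator, which is what makes the granularity illusion discussed above less of a trap.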

isaacabraham commented 9 years ago

Should we amend the existing demos, then, if going forward that's the idiomatic way to parallelise simple seq<'T> workloads?

eiriktsarpalis commented 9 years ago

I already changed a few in my last PR. I left the 'hello world' sample as-is because I found it instructive (I think it returns the job id, so every output is different).

isaacabraham commented 9 years ago

Is it worth leaving Cloud.Parallel in the API then? I'm a fan of having a single way to do a thing - this is really important for getting new people on board quickly as well.

palladin commented 9 years ago

I think Cloud.Parallel is important to keep in the introductory .fsx files for educational reasons. Of course, it is our job (as runtime implementers) to optimise the Cloud.Parallel primitive.

eiriktsarpalis commented 9 years ago

I think so, yes. It is the essential primitive on which everything else is built. In the end, an MBrace user wanting to fully optimize a particular algorithm may end up using the primitive directly. This may complicate the programming model, but then again, nobody said distributed computation was easy.

@palladin I beg to differ. I think that the power of the primitives lies in their unambiguous treatment of granularity. If a runtime implementer decided to overrule this, it would overturn assumptions made by library designers. Not being able to reason about the underlying flow of execution is bound to take a toll on performance gains.

palladin commented 9 years ago

I'm not referring to the granularity of the Cloud.Parallel primitive, but to this: https://github.com/mbraceproject/MBrace.Azure/issues/19, i.e. optimising the unit cost of each cloud block.

eiriktsarpalis commented 9 years ago

Ok, agreed.

dsyme commented 9 years ago

Closing this, as we've moved these examples out of the mainline teaching sequence.

dsyme commented 9 years ago

@eiriktsarpalis @palladin - please close (I'm not an admin here)