Open andsor opened 8 years ago
What we need here is to make the job hierarchy more flexible. Previously I thought that all aggregation happens within a task/subjob. Now I see we should also allow aggregation across multiple tasks/subjobs. That said, the hierarchical concept of embarrassingly parallel execution persists.
This entails the need to specify the number of ICs (= runs) and the number of runs per task directly and independently. Behind the scenes, a separate merge task would aggregate the results from all runs for one job (parameter set instance). Note that the resulting internal array may be too large to fit in memory.
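A minimal sketch of this idea, with hypothetical helper names (`split_runs`, `merge_incremental` are not part of pysimkernel): the total number of runs and the runs per task are specified independently, and a separate merge step combines per-task partial results incrementally, so the full per-run array never has to be materialized in memory at once.

```python
import math

def split_runs(n_runs, runs_per_task):
    """Split the total number of runs (ICs) into per-task index chunks.

    Hypothetical helper: n_runs and runs_per_task are given
    independently, as proposed above.
    """
    n_tasks = math.ceil(n_runs / runs_per_task)
    return [range(t * runs_per_task, min((t + 1) * runs_per_task, n_runs))
            for t in range(n_tasks)]

def merge_incremental(partial_means, partial_counts):
    """Merge per-task partial means into one job-level mean.

    Only per-task aggregates cross task boundaries, not the raw runs,
    which avoids holding all run results in memory simultaneously.
    """
    total = sum(partial_counts)
    return sum(m * c for m, c in zip(partial_means, partial_counts)) / total
```

For example, 10 runs with 4 runs per task yields 3 tasks (sizes 4, 4, 2), and the merge task later reduces the 3 partial aggregates to one result for the parameter set instance.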
@debsankha says
Parallelizing an experiment for multiple ICs differs from parallelizing based on a parameter range. Generating ICs is not the issue I was talking about. Let's consider two scenarios:
Scenario 1 would best be tackled by parallelizing via chunking of the K range, but Scenario 2 would be better served by chunking the 1,000,000 initial conditions into smaller pieces.
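The two chunking strategies can be sketched with one generic helper. This is only an illustration: `chunk` is a hypothetical function, and the concrete K values and IC count are made-up stand-ins inferred from the discussion (Scenario 1 sweeps a parameter K; Scenario 2 runs many initial conditions for a single parameter set).

```python
def chunk(seq, n_chunks):
    """Split a sequence into n_chunks roughly equal contiguous pieces
    (hypothetical helper, not part of pysimkernel)."""
    size, rem = divmod(len(seq), n_chunks)
    pieces, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < rem else 0)
        pieces.append(seq[start:end])
        start = end
    return pieces

# Scenario 1: parallelize over the parameter range K;
# each task sweeps a sub-range of K values.
K_values = [round(0.1 * i, 1) for i in range(1, 11)]
param_tasks = chunk(K_values, 3)

# Scenario 2: parallelize over many initial conditions for one
# parameter set; each task simulates a block of ICs.
n_ics = 1_000_000
ic_tasks = chunk(range(n_ics), 100)
```

The point is that the same task machinery serves both experiments; only the axis being chunked (parameters vs initial conditions) changes.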
My question was: can we design pysimkernel so that it is very easy to perform both kinds of experiments?