To conserve compute resources, we would like to first (optimistically) solve with a smaller number of alternating runs (e.g. 2) and then have the option to resume the process with additional runs later if necessary. I initially considered adding naming/suffix parameters to AlternatingRunParameters, but realized that things get messy when trying to avoid naming conflicts with existing stacks, since prior runs may or may not have kept intermediate stacks. I decided it was easier to leave the current process as-is and handle resumes by specifying the previous aligned stacks as source stacks. This produces stacks with ugly aggregated names, but it avoids conflicts, and the stacks can be renamed later with a script if desired.
To resume a run, you specify the previous aligned result as a "raw" source and set shiftBlocks to true if there were an odd number of prior runs:
"pipelineStackGroups": {
"raw": {
"projectPattern": "^cut_000_to_009$",
"stackPattern": "^c..._s..._v01_align$" // instead of "^c..._s..._v01$"
},
...
},
"pipelineSteps": [
"ALIGN_TILES"
],
...
"affineBlockSolverSetup": {
...
"blockPartition": {
...
"shiftBlocks": false // set to true if odd number of prior runs
},
"alternatingRuns": {
"nRuns": 2,
"keepIntermediateStacks": false
}
},
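The shiftBlocks parity rule above can be sketched as a small helper (a hypothetical function for illustration, not part of the pipeline code):

```python
def shift_blocks_on_resume(prior_runs: int) -> bool:
    """Return the shiftBlocks value to use when resuming.

    Hypothetical helper: per the rule above, blocks must be shifted
    exactly when an odd number of alternating runs have already
    completed, so the new runs continue the alternation instead of
    repeating the last block layout.
    """
    return prior_runs % 2 == 1
```

For example, resuming after 2 prior runs would leave shiftBlocks at false, which is why the config above keeps the default.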
Here is what the stack names might look like when resuming after 2 prior runs:
first run:
source: c009_s310_v01 ->
c009_s310_v01_align_run1
c009_s310_v01_align_run2 (or c009_s310_v01_align if keepIntermediateStacks false)
resumed run:
source: c009_s310_v01_align_run2 ->
c009_s310_v01_align_run2_align_run1
c009_s310_v01_align_run2_align_run2 (or c009_s310_v01_align_run2_align if keepIntermediateStacks false)
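The aggregated naming can be sketched with a hypothetical helper that mimics the suffixing shown above (the function name and behavior are assumptions inferred from this example, not the actual pipeline code):

```python
def aligned_stack_names(source: str, n_runs: int, keep_intermediate: bool) -> list:
    """Stacks left behind by an alternating solve of n_runs runs.

    Assumption (from the example above): each run appends _align_run<i>
    to the source name; with keepIntermediateStacks=false only the final
    result survives, renamed to <source>_align.
    """
    if keep_intermediate:
        return ["{}_align_run{}".format(source, i) for i in range(1, n_runs + 1)]
    return ["{}_align".format(source)]

# A resumed run simply treats the previous aligned result as its source,
# which is where the aggregated names come from:
first = aligned_stack_names("c009_s310_v01", n_runs=2, keep_intermediate=True)
resumed = aligned_stack_names(first[-1], n_runs=2, keep_intermediate=False)
```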
Let me know if you think this approach is reasonable or if you think there is a better way.
Although I did not end up adding any code to support resuming alternating runs, I did improve the Spark distributed affine solve parallelization and fixed a couple of bugs, so there is a little related code to review :).