broadinstitute / pooled-cell-painting-image-processing

BSD 3-Clause "New" or "Revised" License

Map handoffs and checkpoints for genome scale runs of image analysis workflow #1

Open bethac07 opened 4 years ago

bethac07 commented 4 years ago

(Moved here from https://github.com/broadinstitute/pooled-cell-painting-analysis/issues/83)

Right now, the workflow has several stages and/or proposed stages:

  1. CellPainting illumination correction calculation
  2. CellPainting illumination correction
  3. CellPainting segmentation pipeline (see #81)
  4. CellPainting stitching and splitting into tiles
  5. Barcode calling illumination correction calculation
  6. Barcode illumination correction application (see #82) and alignment
  7. Barcode color compensation and barcode calling "sanity check"
  8. Barcode stitching and rescaling and splitting into tiles
  9. Final profiling + barcoding pipeline

This creates 2 pre-setup steps and at least 6 handoffs. Right now, each handoff is manual, with a manual quality check at each one. For each handoff, we need to a) decide how we're going to do file handling and b) decide if and how we will determine success (a quantitative cutoff? If so, how and where do we check it? Manual visual inspection of something? Same questions), or whether we think it can just proceed with something like an AWS Lambda trigger.
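
To make the per-handoff decision concrete, here is a minimal sketch of how each handoff could be described and evaluated. Everything in it is an assumption for illustration: the `Handoff` class, the `decide` function, and the metric names and cutoffs are hypothetical, not part of the existing workflow.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Handoff:
    """One handoff between stages (hypothetical structure)."""
    name: str
    metric: Optional[str] = None       # QC metric to check; None means no check
    min_value: Optional[float] = None  # quantitative cutoff for that metric


def decide(handoff: Handoff, qc: Dict[str, float]) -> str:
    """Return 'proceed', 'halt', or 'manual_review' for one handoff."""
    if handoff.metric is None:
        # No success criterion defined: proceed automatically (trigger-style).
        return "proceed"
    value = qc.get(handoff.metric)
    if value is None or handoff.min_value is None:
        # Metric was expected but is missing, or no cutoff was set:
        # fall back to a human looking at it.
        return "manual_review"
    return "proceed" if value >= handoff.min_value else "halt"
```

For example, `decide(Handoff("step 3 to 4", "fraction_segmented", 0.8), {"fraction_segmented": 0.9})` returns `"proceed"`, while a handoff with no metric always proceeds. Where the QC values come from (a CSV, a log, a database) is the open file-handling question above.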

OPTIONAL BUT REALLY NICE

Misc notes

bethac07 commented 4 years ago

Right now, uploading the pipeline for steps 2-3 causes an infinite recursion loop; we should fix that.
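
The cause of this loop isn't pinned down here, but one common source of recursion with S3-triggered Lambda functions is the function writing objects under the same prefix that triggers it. A minimal guard sketch under that assumption follows; `keys_to_process` and the prefix value are hypothetical, while the event layout is the standard S3 event notification structure.

```python
from urllib.parse import unquote_plus

# Hypothetical: prefixes this function itself writes to.
OWN_OUTPUT_PREFIXES = ("example-project/generated/",)


def keys_to_process(event, own_prefixes=OWN_OUTPUT_PREFIXES):
    """Return object keys from an S3 event, skipping our own outputs."""
    keys = []
    for record in event.get("Records", []):
        # Keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(tuple(own_prefixes)):
            # Written by this function: ignoring it breaks the loop.
            continue
        keys.append(key)
    return keys
```

If the loop turns out to have a different cause, the same idea still applies: the handler needs some way to recognize events it caused itself.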

bethac07 commented 4 years ago

~Right now, uploading pipeline 7 triggers a new run, but we probably don't want to just go ahead and do that on initial upload.~ Actually, this is fine: pipeline 7 needs step 6 to run first to write something to the metadata file, so if it is uploaded before step 6 is run, there's no issue.
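
A sketch of what that implicit guard could look like if made explicit. It assumes the metadata file is JSON and that step 6 records its result under some key; the function name `step7_ready` and the key `step6_output` are both hypothetical placeholders, since the real metadata layout isn't described here.

```python
import json


def step7_ready(metadata_text: str, required_key: str = "step6_output") -> bool:
    """Return True only if the metadata contains what step 6 writes."""
    try:
        metadata = json.loads(metadata_text)
    except json.JSONDecodeError:
        # Unreadable metadata: treat as "step 6 has not run yet".
        return False
    # Hypothetical convention: step 6 adds `required_key` to a JSON object.
    return isinstance(metadata, dict) and required_key in metadata
```

With a check like this, uploading pipeline 7 early is a no-op: `step7_ready('{}')` is `False` until step 6 has written its entry.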