broadinstitute / cellprofiler-on-Terra

Run CellProfiler on Terra. Contains workflows that enable a full end-to-end Cell Painting pipeline.
BSD 3-Clause "New" or "Revised" License

CellProfiler on Terra

WDL workflows and scripts for running a CellProfiler pipeline on Google Cloud hardware. Includes workflows for all steps of a full Cell Painting pipeline.

The workflows run well in Terra and should also work on any Cromwell server that can run WDL workflows. They are currently specific to a Google Cloud backend. (We are open to supporting more backends in the future, specifically other cloud storage locations, including AWS and Azure.)

You can see these workflows in action and try them yourself in the Terra workspace cellpainting.

Three pipelines:

  1. Cell Painting

    • All the workflows necessary to run an end-to-end Cell Painting pipeline, starting with raw images and ending with extracted features, both in database format and aggregated as CSV files.
    • Appropriate for datasets of arbitrary size.
    • Scatters the time-consuming analysis steps over many VMs in parallel. By default, a dataset is split into individual wells, and each well is run on a separate VM.
  2. Cytominer

    • Runs the cytominer-database ingest step to create a SQLite database containing all the extracted features.
    • Runs the aggregation step from pycytominer to create CSV files.
  3. CellProfiler (distributed or single VM)
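To illustrate the per-well scatter described for the Cell Painting pipeline, here is a minimal Python sketch of how a list of image files might be grouped by well before each group is handed to its own VM. It is not the repository's implementation (the actual workflows are written in WDL). The filename convention `<well>_s<site>_w<channel>.tif` and the function name `group_images_by_well` are assumptions made for this example only.

```python
import re
from collections import defaultdict

# Hypothetical filename convention, assumed for illustration only:
# "<well>_s<site>_w<channel>.tif", e.g. "A01_s1_w2.tif".
WELL_PATTERN = re.compile(r"^(?P<well>[A-P]\d{2})_s\d+_w\d+\.tiff?$")


def group_images_by_well(filenames):
    """Group image filenames by well ID, skipping names that do not match."""
    groups = defaultdict(list)
    for name in filenames:
        match = WELL_PATTERN.match(name)
        if match is None:
            # Not an image in the assumed naming scheme (e.g. a metadata file).
            continue
        groups[match.group("well")].append(name)
    # Sort wells and files so the resulting shards are deterministic.
    return {well: sorted(names) for well, names in sorted(groups.items())}


if __name__ == "__main__":
    files = ["A01_s1_w1.tif", "B03_s2_w1.tif", "A01_s1_w2.tif", "notes.txt"]
    for well, images in group_images_by_well(files).items():
        print(well, images)
```

Each entry in the returned dictionary corresponds to one unit of work, so a dataset with N wells would produce N independent analysis tasks under the default splitting.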

How to run these workflows yourself

These workflows are all publicly available and hosted on Dockstore. From there, you can import and run them in Terra or anywhere else you run WDL workflows.

If you just want to give it a try, you can clone the Terra workspace cellpainting, which is preconfigured to run on three plates of sample data.