hubmapconsortium / codex-pipeline

CODEX data processing code
GNU General Public License v3.0

Write some descriptive measures to a text or JSON file at end of pipeline #17

Closed: mruffalo closed this issue 3 years ago

mruffalo commented 3 years ago

The outputs of this pipeline can vary wildly in the number of cells identified by the segmentation method and in the pixel dimensions of the stitched image -- these are functions of the input data and segmentation parameters.

It would be useful to have some of these measures written to a file that's human- and machine-readable, probably in JSON or plain text format. This would make it significantly more convenient to estimate memory usage and runtime of downstream processing steps like SPRM, without having to do something like "open TIFF segmentation mask, count unique cell indices" to obtain the cell count.

This information should include:

This could eventually be used to schedule SPRM and other downstream jobs on compute nodes that are known to have enough memory.
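As a rough illustration of the kind of summary file proposed here, the following minimal sketch computes a cell count and the pixel dimensions from a segmentation mask and writes them as JSON. Everything named in it is hypothetical and not taken from the pipeline: the function names, the JSON keys, the output filename `pipeline_summary.json`, and the assumption that label 0 is background. The mask is a plain nested list to keep the example dependency-free; in practice it would presumably be an array read from the TIFF segmentation mask.

```python
import json
from pathlib import Path


def summarize_segmentation(mask, background=0):
    """Return cell count and pixel dimensions for a 2-D label mask.

    `mask` is a sequence of rows of integer cell indices; `background`
    is the label assumed (hypothetically) to mean "no cell".
    """
    # Collect the distinct labels, then drop the background label.
    labels = {int(value) for row in mask for value in row}
    labels.discard(background)
    height = len(mask)
    width = len(mask[0]) if height else 0
    return {
        "num_cells": len(labels),
        "image_height_px": height,
        "image_width_px": width,
    }


def write_summary(summary, path):
    """Write the summary as indented JSON, readable by people and tools."""
    Path(path).write_text(json.dumps(summary, indent=2) + "\n")


if __name__ == "__main__":
    # Toy 3x4 mask with two cells (labels 1 and 2) on background 0.
    toy_mask = [
        [0, 1, 1, 0],
        [0, 1, 2, 2],
        [0, 0, 2, 2],
    ]
    write_summary(summarize_segmentation(toy_mask), "pipeline_summary.json")
```

A downstream step could then read `num_cells` and the image dimensions from this file to estimate memory needs, instead of reopening the mask.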