Support multi-dimensional TMAPs in output

What: Multi-dimensional TMAPs (e.g., operations on raw data) are currently second-class citizens in execution modes that are expected to produce outputs. For example, inference with segmentation models produces either summary statistics on segmentation quality (via infer_with_pixels) or PNGs stripped of the metadata required for interpretation and reuse (via plot_predictions). The explore mode bypasses multi-dimensional TMAPs altogether.

Why: Easy reuse of evaluated TMAPs (via inference or explore) is one of the key features of ML4H. It has already allowed us to perform "extrapolation" tasks, in which ML infers a learned "rare" feature on an extended dataset (e.g., liver fat from standard MRI, or LV mass and HRR from resting ECGs). So far, only scalar features are fully supported, through the exchange of CSV files; ongoing work on segmentation and parameterization requires extending this to more complex multi-dimensional data.

How: Allow outputs in more sophisticated file formats (HDF5 as a start) that can handle multi-dimensional (semi-)structured data. TMAPs contain enough information to interpret the data and guide its storage. Because producing multi-dimensional outputs is not always needed (and is potentially slow), the behavior should be enabled only through optional command-line flags.
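As a rough illustration of the proposal, the sketch below stores one evaluated multi-dimensional tensor in an HDF5 file together with the metadata needed to interpret it, and reads it back. It is only a sketch under stated assumptions: `TensorSpec` is a hypothetical stand-in for the information a TMAP carries (it is not the ML4H class), the `--output_multidim_hdf5` flag name and the `<name>/<sample_id>` dataset layout are invented for this example, and only standard `h5py` and `numpy` calls are used.

```python
import argparse
import json
import os
import tempfile
from dataclasses import dataclass, field

import h5py
import numpy as np


@dataclass
class TensorSpec:
    """Hypothetical stand-in for TMAP metadata (not the ML4H class)."""
    name: str
    shape: tuple
    channel_map: dict = field(default_factory=dict)


def build_parser():
    """Opt-in flag (hypothetical name): nothing is written unless a path is given."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--output_multidim_hdf5", default=None,
        help="If set, write evaluated multi-dimensional tensors to this HDF5 file.",
    )
    return parser


def write_evaluated_tensor(path, sample_id, spec, array):
    """Store one evaluated tensor plus the metadata needed to interpret it."""
    array = np.asarray(array)
    if array.shape != tuple(spec.shape):
        raise ValueError(f"{spec.name}: expected shape {spec.shape}, got {array.shape}")
    with h5py.File(path, "a") as hd5:
        # Assumed layout: one dataset per sample, grouped under the tensor name.
        dataset = hd5.create_dataset(
            f"{spec.name}/{sample_id}", data=array, compression="gzip",
        )
        # Metadata travels with the data as an HDF5 attribute.
        dataset.attrs["channel_map"] = json.dumps(spec.channel_map)


def read_evaluated_tensor(path, sample_id, name):
    """Return the stored array and its channel map."""
    with h5py.File(path, "r") as hd5:
        dataset = hd5[f"{name}/{sample_id}"]
        return dataset[()], json.loads(dataset.attrs["channel_map"])


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.output_multidim_hdf5 is not None:
        spec = TensorSpec("segmentation", (8, 8, 2), {"background": 0, "organ": 1})
        prediction = np.zeros(spec.shape, dtype=np.float32)  # placeholder for a model output
        write_evaluated_tensor(args.output_multidim_hdf5, "sample_0", spec, prediction)
```

Keeping the channel map as a dataset attribute means a downstream reader can validate and interpret the array without the PNG or CSV intermediates; whether the real implementation should store more of the TMAP (for example normalization or interpretation) is left open here.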

Acceptance Criteria

[ ] Multi-dimensional TMAPs are accounted for in inference and explore modes
[ ] Users can produce files containing evaluated multi-dimensional TMAPs preserving metadata
[ ] Users can re-use the output files and the same TMAPs as model inputs

broadinstitute/ml4h, issue #384