Given patchxdimension activations and segmentation map, return more granular labels
High-level design decisions need to be made about whether to compute our own segmentation maps, use pre-computed datasets, and returning hierarchical labels (e.g. eye -> face -> person).