Open Justin-J-Miller opened 1 week ago
Is this really not available?
It looks like cluster.py at least theoretically has an option for this: https://github.com/bowman-lab/enspara/blob/735a3fb52b61a30268f07375376edfb5859ad99a/enspara/apps/cluster.py#L148C30-L153C1
It's supported for trajectories, but for features/h5 files it is currently disallowed:
Clustering on subsampled data helps both with memory use and with the time needed to compute kmedoids updates. It would be nice to add an option to subsample featurized datasets at the point of clustering.
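For illustration, here is a rough sketch of what I mean by subsampling features before clustering. It is plain NumPy and does not use or reflect enspara's actual code; `subsample_features`, `trajs`, and `stride` are hypothetical names, and I'm assuming the features can be read as one 2D array (frames x features) per trajectory.

```python
import numpy as np


def subsample_features(trajs, stride):
    """Keep every `stride`-th frame of each per-trajectory feature array.

    Hypothetical helper, not part of enspara. Returns the strided frames
    concatenated into one array, plus the strided length of each trajectory
    so the frames can be mapped back to their source trajectory later.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    strided = [np.asarray(t)[::stride] for t in trajs]
    lengths = [len(t) for t in strided]
    return np.concatenate(strided, axis=0), lengths


# Toy data standing in for featurized trajectories: 10 and 7 frames, 3 features.
rng = np.random.default_rng(0)
trajs = [rng.normal(size=(10, 3)), rng.normal(size=(7, 3))]

# Cluster centers would be found on `sub`; the full dataset could then be
# assigned to those centers in a separate pass.
sub, lengths = subsample_features(trajs, stride=3)
```

With a stride of 3, the clustering step would only see about a third of the frames, which is where the memory and kmedoids-update savings would come from.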