Sum of stability scores as a measure of cluster quality?

I am working with a dataset from a domain in which I am not an expert, so I have neither labels nor a ground-truth clustering available. I am trying to find a good way to assess the quality of the clusters I obtain as I vary the minimum number of samples (min_samples) and the minimum cluster size (min_cluster_size).

The clustering metrics I am aware of either assume that the points in each cluster are normally distributed, e.g. Silhouette Score and F-Score, or rely on a reference/ground-truth labeling, e.g. adjusted mutual information and completeness scores.

I wonder whether one could define a quality metric from the stability scores of the flattened clusters (which I believe are available from the cluster_persistence_ attribute). If my understanding is correct, these scores are already "normalised" for the sizes of the different clusters, in the sense that each one is computed by summing the relative excess of mass over all points assigned to that cluster. A straightforward sum of all stability scores therefore seems like a reasonable definition of such a metric: the higher the sum, the better the clustering, and one can search the hyper-parameter space for the maximal value.
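To make the idea concrete, here is a rough sketch of the search I have in mind. It assumes the hdbscan package and its cluster_persistence_ attribute; the helper names total_stability, fit_persistence and search_by_total_stability are hypothetical ones made up for illustration, not part of the library.

```python
from itertools import product
from typing import Callable, Iterable, List, Sequence, Tuple


def total_stability(persistence: Sequence[float]) -> float:
    """Proposed metric: plain sum of the per-cluster stability scores."""
    return float(sum(persistence))


def fit_persistence(X, min_cluster_size: int, min_samples: int):
    """Fit HDBSCAN once and return its per-cluster persistence scores.

    Assumes the hdbscan package is installed; imported lazily so the
    other helpers can be used (and tested) without it.
    """
    import hdbscan

    clusterer = hdbscan.HDBSCAN(
        min_cluster_size=min_cluster_size,
        min_samples=min_samples,
    ).fit(X)
    return clusterer.cluster_persistence_


def search_by_total_stability(
    X,
    min_cluster_sizes: Iterable[int],
    min_samples_values: Iterable[int],
    fit: Callable = fit_persistence,
) -> List[Tuple[float, int, int]]:
    """Grid-search both hyper-parameters and rank settings by the metric.

    Returns (score, min_cluster_size, min_samples) tuples, best first.
    """
    results = []
    for mcs, ms in product(min_cluster_sizes, min_samples_values):
        score = total_stability(fit(X, mcs, ms))
        results.append((score, mcs, ms))
    results.sort(key=lambda r: r[0], reverse=True)
    return results
```

Calling search_by_total_stability(X, [5, 10, 20], [1, 5, 10]) on a feature matrix X would return every setting ranked by the summed score, so the whole ranking can be inspected rather than only the maximum. A run in which every point is labelled as noise has no clusters and therefore scores 0.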

It would be great to hear your thoughts on this. For HDBSCAN in particular, wouldn't this, or some other metric based on cluster stability, be more suitable than DBCV (as well as easier to explain and cheaper to compute)?

scikit-learn-contrib / hdbscan
