Open tacitvenom opened 1 year ago
I think the catch may be having a cluster of size 1; that would definitely break something.
In this toy example, there are two clusters (cluster id 0 and 1). The original data had around 3000 clusters though.
It appears that a ValueError is raised if a cluster has size 2, which I believe is not expected behaviour. You can reproduce it using the adapted toy example from above:
validity_index(np.random.rand(5, 2), np.array([-1, 1, 1, 0, 0]))
(The size of the noise cluster -1 does not affect this.)
This issue is preventing me from using this useful metric on a data source that has several small (correct) clusters.
I'm having the same issue. It looks like it is indeed coming from clusters of size 2. Is there any update on a fix for that, or do I need to change the hyperparameters so that I don't get clusters that small?
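Until there is a fix, one way to avoid touching the clusterer's hyperparameters is to relabel any cluster smaller than 3 points as noise (-1) before scoring. This is just a sketch of that workaround, not an official fix; the helper name and the `min_size` cutoff are my own assumptions:

```python
import numpy as np

def drop_small_clusters(labels, min_size=3):
    """Hypothetical workaround: relabel clusters with fewer than
    min_size members as noise (-1), since validity_index appears
    to break on size-2 clusters. Noise points (-1) are left as-is."""
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels[labels >= 0], return_counts=True)
    for cid, n in zip(ids, counts):
        if n < min_size:
            labels[labels == cid] = -1  # demote the tiny cluster to noise
    return labels

# Clusters 0 and 1 have size 2 and are demoted; cluster 2 (size 3) survives.
labels = np.array([-1, 1, 1, 0, 0, 2, 2, 2])
print(drop_small_clusters(labels))  # → [-1 -1 -1 -1 -1  2  2  2]
```

The cleaned labels can then be passed to `validity_index` together with the original data. Whether dropping tiny clusters is acceptable depends on the use case, of course; it changes what the metric measures.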
Still seeing the same thing in 2023. This is a serious bug with size-2 clusters.
For a clustering use case, I tried different parameters, and while calculating the validity index I ran into the following ValueError:
Following is a toy example I could reproduce it with: