Closed (avouros closed this issue 5 years ago)
I agree with you, @avouros. I modified this to use a precomputed distance matrix, and your tip helped a lot.
If `np.shape(neighbors)[0]` is used instead of `np.shape(neighbors)[1]` (as it should be), the resulting index is almost always low (hardly ever positive), even when evaluating good clustering results such as the one obtained by running hdbscan on the noisy moons dataset (provided by the author).
Does anyone know why?
@onofricamila I have the same observation. Do you have any insights on this? I am wondering if something is wrong.
Thank you very much for providing the code for the DBCV index.
I noticed that in the `_core_dist` function you have defined the number of neighbours (`n_neighbors`) to equal the dimensionality of the dataset, `np.shape(neighbors)[1]` (Line 57 of DBCV.py). Shouldn't this have been `np.shape(neighbors)[0]`?
Also, based on the formula of Moulavi et al. (Definition 1, Equation 3.1), shouldn't Line 62 of your code have been `core_dist = (numerator / (n_neighbors - 1)) ** (-1/n_features)`?
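To make the proposal concrete, here is a minimal sketch of the core distance as I read Definition 1 of Moulavi et al. It is not the repository's code: the function name `all_points_core_dist` and its signature are hypothetical, it assumes Euclidean distance, and it assumes the cluster contains no duplicate points (a zero distance would make the inverse blow up).

```python
import numpy as np


def all_points_core_dist(cluster, idx):
    """Core distance of cluster[idx] with respect to the rest of its cluster.

    Hypothetical helper (not part of DBCV.py). `cluster` has shape
    (n_points, n_features); Euclidean distance and no duplicate points
    are assumed.
    """
    cluster = np.asarray(cluster, dtype=float)
    # Rows are points, columns are features, so the point count is shape[0].
    n_points, n_features = cluster.shape

    # Distances from the chosen point to every point in the cluster.
    dists = np.linalg.norm(cluster - cluster[idx], axis=1)
    # Drop the self-distance, leaving n_points - 1 neighbours.
    dists = np.delete(dists, idx)

    # Sum of inverse distances raised to the dimensionality.
    numerator = np.sum((1.0 / dists) ** n_features)
    # Average over the n_points - 1 neighbours, then take the -1/d power.
    return (numerator / (n_points - 1)) ** (-1.0 / n_features)


if __name__ == "__main__":
    points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
    print(all_points_core_dist(points, 0))
```

One sanity check on the exponent: when every neighbour lies at the same distance r, this expression returns exactly r, which is what a distance-like quantity should do. With a positive exponent `1/n_features` it would return 1/r instead.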