christopherjenness / DBCV

Python implementation of Density-Based Clustering Validation
MIT License

Results don't match with reference implementation in Matlab #25

Open gschlake opened 1 year ago

gschlake commented 1 year ago

Hello,

Thanks for this Python implementation of DBCV. However, its results don't match the reference Matlab implementation by Moulavi et al. This is partly because your implementation treats outliers as a cluster, but even after fixing this the results are completely different. On the first example dataset of the reference implementation, your implementation gives -0.2986, your implementation with correct outlier processing gives 0.5074, and the reference implementation gives 0.6149.
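To make the outlier point concrete, here is a minimal sketch of one way to keep noise from being scored as a cluster. It rests on several assumptions that are not confirmed in this thread: noise is labelled `-1` (the scikit-learn/hdbscan convention), the wrapped scorer returns a cluster-size-weighted average over the points it is given, and noise should still count in the total number of objects used for weighting, as I read the definition in the paper. `score_ignoring_noise` and `score_fn` are hypothetical names, not part of this library.

```python
import numpy as np

NOISE_LABEL = -1  # assumption: noise/outliers are labelled -1


def score_ignoring_noise(X, labels, score_fn, noise_label=NOISE_LABEL):
    """Score a clustering without treating noise points as a cluster.

    score_fn(X, labels) is any validity index that returns a
    cluster-size-weighted average over the points it receives.
    """
    X = np.asarray(X)
    labels = np.asarray(labels)

    # Drop noise points so they never form a pseudo-cluster.
    keep = labels != noise_label
    n_total = labels.shape[0]
    n_kept = int(keep.sum())

    # Assumption: with no clustered points at all, report a score of 0.
    if n_kept == 0:
        return 0.0

    raw = score_fn(X[keep], labels[keep])

    # Rescale so cluster weights are |C_i| / n_total (noise included in
    # the denominator) instead of |C_i| / n_kept.
    return raw * n_kept / n_total
```

With this repository, `score_fn` could be something like `lambda X, y: DBCV(X, y, dist_function=euclidean)`, assuming that call signature. Note that this only addresses the outlier handling; it would not explain the remaining gap between 0.5074 and 0.6149.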

I think a difference this significant discourages using this implementation in scientific contexts until it is fixed.

kaitlynng9 commented 2 months ago

In case this helps anyone: I came across another implementation. See this library, which matches the Matlab implementation, and an issue raised in that repo comparing their library to this one.