Open mdruiter opened 1 year ago
The problem is in the 'definition' of `neighbor_matrix`: `_compute_distance_and_neighbor_matrix` returns indexes within the cluster, but `_prob_distances_ev` treats the numbers as being global.
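To make the local-versus-global mix-up concrete, here is a minimal, standalone sketch. It is not the library's code: `nearest_neighbors_local` and `to_global` are hypothetical helpers, and the six points are made up for illustration. It only shows how a neighbor index computed inside a cluster's own sub-array points at the wrong row when it is read as an index into the full data set.

```python
from math import dist

def nearest_neighbors_local(points, labels):
    """Return each point's nearest neighbor as a position *within its own cluster*.

    Hypothetical helper; assumes every cluster has at least two points.
    """
    members = {}  # label -> list of global row indices, in input order
    for i, lab in enumerate(labels):
        members.setdefault(lab, []).append(i)

    local = [None] * len(points)
    for idx in members.values():
        for pos, i in enumerate(idx):
            # Search only among the other members of the same cluster.
            local[i] = min(
                (p for p in range(len(idx)) if p != pos),
                key=lambda p: dist(points[i], points[idx[p]]),
            )
    return local, members

def to_global(local, labels, members):
    """Translate cluster-local positions back to global row indices."""
    return [members[lab][loc] for loc, lab in zip(local, labels)]

# Two well-separated clusters of three points each (illustrative data).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0),
          (10.0, 10.0), (10.0, 11.0), (15.0, 15.0)]
labels = [0, 0, 0, 1, 1, 1]

local, members = nearest_neighbors_local(points, labels)
fixed = to_global(local, labels, members)

print(local)  # cluster-local positions: [1, 0, 1, 1, 0, 1]
print(fixed)  # global row indices:      [1, 0, 1, 4, 3, 4]
```

Read as global indices, the local values for the second cluster (1, 0, 1) all land on rows of the first cluster, which matches the symptom described in the report. Mapping through each cluster's member list is one possible way to restore the intended rows; whether the actual fix belongs in the producer or the consumer of the matrix is a design choice for the maintainers.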
Hey @mdruiter - thanks for noting the issue and where it is occurring.
Are you able to submit a fix in a pull request?
I think I have found a bug that occurs when passing some `cluster_labels`. When I completely reverse the order of all input (`data` and `cluster_labels`), and I reverse the result (`local_outlier_probabilities`), I would expect the same numbers. This does happen as long as all `cluster_labels` values are equal. Once I have two (really separate) clusters, the results change when flipped! An extra indication that things go wrong (IMHO): the second cluster's neighbor numbers are in the first cluster!

A small reproduction example: