I've encountered unexpected behavior when clustering my text corpus. The corpus contains the following texts:
text1
text2
text3
With the distance metric I'm using, dist(text1, text2) < dist(text1, text3). However, the clustering algorithm grouped text1 with text3 and labeled text2 as an isolated point, which is the opposite of what I expected.
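To rule out a mismatch between the distances I expect and the ones the algorithm actually sees, the inequality can be checked directly on the input vectors. This is only a minimal sketch, assuming the texts have already been embedded as numeric vectors; the values in `vectors` and the `pairwise_euclidean` helper are hypothetical placeholders, not my actual data or pipeline:

```python
import math

# Hypothetical embeddings (assumption): replace with the vectors
# that are actually passed to the clustering algorithm.
vectors = {
    "text1": [0.0, 0.0],
    "text2": [1.0, 0.0],
    "text3": [0.0, 2.0],
}


def pairwise_euclidean(vecs):
    """Return {(a, b): Euclidean distance} for every unordered pair of keys."""
    names = sorted(vecs)
    return {
        (a, b): math.dist(vecs[a], vecs[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


distances = pairwise_euclidean(vectors)
for pair, d in distances.items():
    print(pair, round(d, 4))

# The relation described above: text1 is closer to text2 than to text3.
closer_to_text2 = distances[("text1", "text2")] < distances[("text1", "text3")]
print("dist(text1, text2) < dist(text1, text3):", closer_to_text2)
```

If the check holds on the real vectors, the raw distances are as described and the grouping has to come from how the algorithm uses them rather than from the input itself.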
Parameters Used:
min_cluster_size = 2
min_samples = 2
distance: Euclidean distance