erdogant / clustimage

clustimage is a Python package for unsupervised clustering of images.
https://erdogant.github.io/clustimage

Wrong distance metric used with hash features? #24

Closed: dlindenkreuz closed this issue 1 year ago

dlindenkreuz commented 1 year ago

As far as I know, the Euclidean distance metric does not make sense for comparing hashes as features. One would use the Hamming distance instead, since it counts the number of bits in which two hashes differ.

However, the examples show pHash in combination with the Euclidean metric. Is this intended, or did I miss something?

erdogant commented 1 year ago

The default method for clustimage is PCA. When using hashes instead, it is wise to also think about the distance metric: Hamming distance would be a better choice than Euclidean. Where did you see this example?
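To illustrate the point about metrics, here is a minimal, standard-library-only sketch. It does not use the clustimage API; the 8-bit hash values and the helper names (`hamming_distance`, `to_bits`, `euclidean_distance`) are hypothetical and chosen only for illustration. It assumes a hash is stored as an integer. It shows that the Hamming distance counts differing bits, whereas a plain numeric difference between the integer-encoded hashes can rank a near-identical hash as farther away than a clearly different one.

```python
import math


def hamming_distance(h1: int, h2: int) -> int:
    """Count the bit positions in which two integer-encoded hashes differ."""
    return bin(h1 ^ h2).count("1")


def to_bits(h: int, n_bits: int):
    """Unpack an integer-encoded hash into a list of 0/1 values (MSB first)."""
    return [(h >> i) & 1 for i in reversed(range(n_bits))]


def euclidean_distance(u, v) -> float:
    """Plain Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical 8-bit hashes (real perceptual hashes are typically longer).
ref = 0b00000000
near = 0b10000000  # differs from ref in 1 bit
far = 0b00000111   # differs from ref in 3 bits

# Hamming distance ranks `near` as closer to `ref` than `far`.
print(hamming_distance(ref, near), hamming_distance(ref, far))

# Treating the hashes as plain numbers reverses that ranking.
print(abs(ref - near), abs(ref - far))

# On unpacked 0/1 vectors, squared Euclidean distance equals the Hamming distance.
print(euclidean_distance(to_bits(ref, 8), to_bits(far, 8)) ** 2)
```

How much the Euclidean metric misleads in practice therefore depends on how the hash is encoded as features: on unpacked bit vectors it is a monotone function of the Hamming distance, while on integer or otherwise packed encodings it is not.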

dlindenkreuz commented 1 year ago

I can't find it anymore. Must have been some blog post or Gist. My bad! Closing.