pHash correctness

I'm trying to use pHash across different programming languages. For that purpose, I treat the hashes produced by the Python libraries https://github.com/thorn-oss/perception and https://github.com/JohannesBuchner/imagehash as canonically correct.

The documentation of this project suggests that HasherConfig::new().hash_alg(HashAlg::Mean).preproc_dct().to_hasher() would produce a compatible hash, but in practice it does not. After extensive experimentation, I've identified three changes that together produce nearly the same results (Hamming distance of ~4 on a 1024-bit hash once all three are applied):

[ ] HashAlg::Median, lifted straight from the old img_hash_median crate
[ ] Different bit order: reverse the bits of each byte in the resulting ImageHash, so that the bit order 76543210 becomes 01234567.
[ ] Different conversion to grayscale: Pillow and the image crate use different conversion factors to go from RGB to grayscale. This has by far the lowest impact of the three (and from a quick search it seems the Python versions also differ from the original C version of pHash here).
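To make the three changes concrete, here is a rough sketch of each one in isolation. Everything in it is an assumption rather than code from img_hash or the Python libraries: the function names are hypothetical, the median threshold uses a strict "greater than" comparison, and the grayscale weights are my understanding of what each library does (Pillow documents the ITU-R 601-2 weights 299/587/114; the Rec. 709 weights and the rounding for the image crate should be checked against the version in use).

```rust
/// Change 1 (sketch): threshold against the median instead of the mean.
/// Assumes `values` is non-empty, e.g. the low-frequency DCT coefficients.
fn median(values: &[f32]) -> f32 {
    assert!(!values.is_empty(), "median of an empty slice is undefined");
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        // Even length: average the two middle values.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// One bit per input value: true where the value is strictly above the median.
fn median_hash_bits(values: &[f32]) -> Vec<bool> {
    let m = median(values);
    values.iter().map(|&v| v > m).collect()
}

/// Change 2 (sketch): reverse the bit order inside every byte,
/// so that bit positions 76543210 become 01234567.
fn reverse_bits_per_byte(hash: &[u8]) -> Vec<u8> {
    hash.iter().map(|b| b.reverse_bits()).collect()
}

/// Change 3 (sketch): ITU-R 601-2 luma, the weights Pillow documents
/// for RGB -> L conversion. The rounding used here is an assumption.
fn luma_bt601(r: u8, g: u8, b: u8) -> u8 {
    ((r as u32 * 299 + g as u32 * 587 + b as u32 * 114 + 500) / 1000) as u8
}

/// Rec. 709 luma, assumed here to approximate what the image crate does.
fn luma_bt709(r: u8, g: u8, b: u8) -> u8 {
    (0.2126 * r as f32 + 0.7152 * g as f32 + 0.0722 * b as f32).round() as u8
}
```

With these weights a pure green pixel (0, 255, 0) maps to 150 under the 601 formula and 182 under the 709 formula, which shows how the two grayscale inputs can differ before any hashing happens.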

I'll try to make the necessary PRs to make each of these options possible without changing the existing defaults.
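For anyone who wants to check how close their own output is to the Python reference, the comparison I have in mind is a plain bitwise Hamming distance. This is a minimal sketch, assuming both hashes are available as raw byte slices of equal length (128 bytes for a 1024-bit hash); hamming_distance is a hypothetical helper written for illustration, not part of this crate's API.

```rust
/// Number of differing bits between two hashes given as raw bytes.
/// Returns None when the lengths differ, since the hashes are then
/// not comparable.
fn hamming_distance(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    // XOR leaves a 1 wherever the two bytes disagree; count those bits.
    Some(a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum())
}
```

A distance of 0 means the hashes are bit-identical. If the bit order within each byte differs between two implementations, the distance will be large even for the same image, so it is worth trying the comparison both with and without the per-byte bit reversal.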

qarmin / img_hash

pHash correctness #12