Interesting, I've had the Haar wavelet suggested as well. However, I cannot base my implementation on pHash because its license is incompatible with img_hash's and could possibly cause conflicts. If you can find a high-level description of the algorithm somewhere, that would be very helpful.
Recently, after some thought and research, I've figured out how I want to implement this. The Marr wavelet is a continuous function and so needs to be discretely sampled for the transform. Clicking around the relevant Wikipedia articles, I think pHash's implementation has something to do with the Difference of Gaussians operation, which effectively performs edge detection on the image.
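For reference, here is a rough sketch of what discretely sampling the Mexican hat could look like in Rust. This is only an illustration under my own assumptions (grid centered on the origin, normalization constant dropped since a hash only cares about relative weights), not pHash's method.

```rust
/// Sample the 2D Marr / Mexican-hat wavelet
///     psi(x, y) ~ (1 - r^2 / (2 sigma^2)) * exp(-r^2 / (2 sigma^2)),  r^2 = x^2 + y^2,
/// on an n x n grid centered on the origin. The 1 / (pi sigma^4) normalization
/// is omitted because only the relative weights matter for hashing.
fn sample_marr_wavelet(n: usize, sigma: f64) -> Vec<Vec<f64>> {
    let half = (n as f64 - 1.0) / 2.0;
    (0..n)
        .map(|y| {
            (0..n)
                .map(|x| {
                    let dx = x as f64 - half;
                    let dy = y as f64 - half;
                    let t = (dx * dx + dy * dy) / (2.0 * sigma * sigma);
                    (1.0 - t) * (-t).exp()
                })
                .collect()
        })
        .collect()
}
```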
I plan on rewriting the API to allow the Discrete Cosine Transform as a preprocessing step for the mean, gradient, and double-gradient hash types, and to add Difference of Gaussians as another preprocessing step that can apply to the Blockhash algorithm as well as the others. The image crate provides a Gaussian blur function, so all that's necessary is to implement the difference part, which should be easy.
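Something along these lines, as a minimal sketch assuming the current `image` API (`imageops::blur` for the Gaussian blur); the mid-grey bias and clamping choices here are mine, not anything from pHash:

```rust
use image::{imageops, GrayImage, Luma};

/// Difference of Gaussians: blur the grayscale image at two different sigmas
/// and subtract, which suppresses flat regions and keeps mostly edge detail.
fn difference_of_gaussians(gray: &GrayImage, sigma_small: f32, sigma_large: f32) -> GrayImage {
    let narrow = imageops::blur(gray, sigma_small);
    let wide = imageops::blur(gray, sigma_large);
    let (w, h) = gray.dimensions();
    GrayImage::from_fn(w, h, |x, y| {
        let a = narrow.get_pixel(x, y).0[0] as i16;
        let b = wide.get_pixel(x, y).0[0] as i16;
        // Bias around mid-grey so both positive and negative responses fit in
        // the u8 range, and clamp to avoid wrapping.
        Luma([(128 + a - b).clamp(0, 255) as u8])
    })
}
```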
You are welcome to copy my implementation of the Marr wavelet from https://github.com/c-edw/perceptual-hash-poc.
I'm still struggling with basic vector algebra, but I'm not sure that code is implementing a Marr wavelet transform. It looks like it's just multiplying the pixel values by the scaled wavelet. My uneducated guess is that what you've implemented there is just a weighted average, which might well be useful in its own right, but it's not the typical usage of this wavelet that I've seen in my research.
The image crate already provides a Gaussian blur function which I can use to implement Difference of Gaussians, though I don't actually know how useful it will be with the existing hash algorithms, because a mostly white image with a handful of black pixels will scale down to a flat grey image with few features. I imagine the Blockhash algorithm would be the only one actually worth combining with DoG as a preprocessing step.
The code there is a translation of the formula seen here.
It's possible that I missed something, but the algorithm seems to generate correct hashes. As an example, the code compares two images at very different resolutions; the Hamming distance when using the Marr wavelet is zero.
Yes, but all that does is produce the value of the wavelet at that point; then you just multiply it with the corresponding luminance values before passing them to a normal averaging hash. As far as I can tell, you've basically created a weighted mean hash that pays more attention to the center and edges and less to the intermediate areas. It's not a wavelet transform, as far as I can see.
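To spell out the distinction as I understand it (this is my own reading of the two, so take it with a grain of salt):

```latex
% Weighting (roughly what the PoC does): each luminance value L is scaled by
% the wavelet psi sampled at that same pixel, then the result is fed to a
% normal mean hash.
L'(x, y) = \psi(x, y)\, L(x, y)

% Wavelet transform: the wavelet is correlated with the image at every
% translation (b_x, b_y), and usually at several scales a, giving a map of
% edge/corner responses rather than a reweighted image.
W(a, b_x, b_y) = \sum_{x, y} \psi_a(x - b_x,\; y - b_y)\, L(x, y)
```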
If you're testing with images of the exact same content and aspect ratio, just different resolutions, I don't see why this hash would perform much differently than a regular mean hash. I'm sure it works fine for that case, but I don't see any added benefit.
I don't see why this hash would perform much differently than a regular mean hash.
Well, it does. The average (aHash) algorithm yields a Hamming distance of 2 in that scenario.
I'll try it. Like I said, I haven't looked at the pHash source in order to avoid any IP conflicts, since it has an incompatible license (similar to how ReactOS can't accept contributions from anyone who has seen Windows source code). Besides, it's no fun just copying someone else's implementation.
Reading more into the pHash site without looking at their sources, I still think they're using basically the same approach as the one I proposed: https://www.phash.org/docs/design.html
We have developed a new image hash based on the Marr wavelet that computes a perceptual hash based on edge information with particular emphasis on corners. [...] Basically, the edge information attained from the wavelet is compressed into a fixed length hash of 72 bytes. Binary quantization allows for relatively fast hamming distance computation between hashes.
I think they're just performing edge enhancement using Difference of Gaussians (or Laplacian of Gaussian) and then counting the edge pixels somehow, probably breaking the image into blocks like the Blockhash.io algorithm.
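Purely as a guess at the general shape of that (the block count, threshold, and median comparison below are my assumptions, not pHash's actual algorithm), it could look something like this, reusing the DoG sketch from earlier:

```rust
use image::GrayImage;

/// Speculative block-based edge hash: threshold a Difference-of-Gaussians image
/// into an edge map, count edge pixels per block, and emit one bit per block
/// depending on whether that block has more edge pixels than the median block.
fn block_edge_hash(dog: &GrayImage, blocks: u32, threshold: u8) -> Vec<bool> {
    let (w, h) = dog.dimensions();
    let (bw, bh) = (w / blocks, h / blocks);
    let mut counts = Vec::with_capacity((blocks * blocks) as usize);
    for by in 0..blocks {
        for bx in 0..blocks {
            let mut count = 0u32;
            for y in by * bh..(by + 1) * bh {
                for x in bx * bw..(bx + 1) * bw {
                    // A pixel counts as an edge if its DoG response deviates far
                    // enough from the mid-grey bias used in the earlier sketch.
                    if dog.get_pixel(x, y).0[0].abs_diff(128) > threshold {
                        count += 1;
                    }
                }
            }
            counts.push(count);
        }
    }
    // One bit per block: is this block busier than the median block?
    let mut sorted = counts.clone();
    sorted.sort_unstable();
    let median = sorted[sorted.len() / 2];
    counts.iter().map(|&c| c > median).collect()
}
```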
@c-edw I'm still considering your approach but maybe as a generalized API for weighted sampling with the Marr wavelet as one of the possible weighting functions.
If we are on the hash-variant train: take a look at these sample Python implementations: https://github.com/thorn-oss/perception#supported-hashing-algorithms
It matches pictures better:
http://www.phash.org/demo/