KilianB / JImageHash

Perceptual image hashing library used to match similar images
MIT License

Another Good Image Hash Method #65

Open mrlimbic opened 2 years ago

mrlimbic commented 2 years ago

Just thought you might also like this image hash method paper.

"AN IMAGE SIGNATURE FOR ANY KIND OF IMAGE" by H. Chi Wong, Marshall Bern, and David Goldberg

http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps

BigPanda97 commented 2 years ago

According to the website the file should be a PDF, but it is a ".ps" file. Are you sure it's clean? In a hex editor, you can see that it contains some strange-looking commands/codes.

mrlimbic commented 2 years ago

It's in a folder with other PDFs, but yes, that particular file is PostScript. On macOS it opens in Preview, converted to a PDF. I think it's legit as it's on an academic site.

I came across it because it is referenced in a Python image matching tool as the "goldberg" hash. Apparently it works very well.

https://github.com/ProvenanceLabs/image-match/blob/master/image_match/goldberg.py

I'm interested because I want to match frames in video. Most of the image hash methods haven't worked very well in my tests so far, because one video frame is often so similar to the previous and next frames that matches happen much too frequently.

One of the things that algorithm does is apply a moiré filter before hashing. That is good if you need to distinguish very similar images, as I do. A slight movement shouldn't match as strongly.

BigPanda97 commented 2 years ago

Have you already seen https://github.com/facebook/ThreatExchange? It was developed by Facebook and contains an image hashing algorithm (PDQ) and a video hashing algorithm (TMK), and they plan to implement another way of hashing videos (vPDQ) soon.

Explained in detail here: https://github.com/facebook/ThreatExchange/blob/main/hashing/hashing.pdf

mrlimbic commented 2 years ago

I'll test how noisy PDQ is compared to other hashes. My simple "hash noise" test just compares the hash of the current frame to the hash of the previous frame.
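A minimal sketch of that kind of frame-to-frame test, assuming each frame's hash is already available as an integer of known bit length. The function names and the sample 8-bit values are made up for illustration and are not part of JImageHash or PDQ:

```python
from typing import List, Sequence


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two equal-length hashes differ."""
    return bin(a ^ b).count("1")


def frame_to_frame_noise(hashes: Sequence[int], bits: int) -> List[float]:
    """Normalized Hamming distance between each frame hash and the previous one.

    0.0 means identical hashes, 1.0 means every bit differs.
    """
    return [
        hamming_distance(prev, cur) / bits
        for prev, cur in zip(hashes, hashes[1:])
    ]


# Hypothetical 8-bit hashes for five consecutive frames (illustrative only).
frame_hashes = [0b10110010, 0b10110011, 0b10110011, 0b01001100, 0b01001110]
noise = frame_to_frame_noise(frame_hashes, bits=8)
print(noise)  # [0.125, 0.0, 1.0, 0.125]
```

In this toy sequence the third value (1.0) would correspond to a cut, while the small nonzero values are the "noise" between near-identical frames. How those values are distributed for real footage depends entirely on the hash being tested.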

You can see how noisy the perceptual hash from this JImageHash library is. The genuinely very different frames stand out (where there is a cut to a different shot), but scene detection is not enough for me. I need much less noise between frames that are similar but not exactly the same.

https://drive.google.com/file/d/1TBwNw1Ymh_iRSTsRAOBz9BSv8VnL8BWm/view?usp=sharing