opencv / opencv_contrib

Repository for OpenCV's extra modules
Apache License 2.0

AverageHash image algorithm calculating comparison incorrectly #3295

Open joshuaherrera opened 2 years ago

joshuaherrera commented 2 years ago
Detailed description

The AverageHash image comparison algorithm is calculating a hamming distance that is too large when comparing the following two screenshots. The hamming distance calculated is 57, although one can see that the images are practically identical, apart from some text toward the bottom. I tried other open source AverageHash algorithms (for example imghash) and received hamming distances of between 0 and 3.

NOTE: Do not navigate to okta[.]ru[.]com as the domain may be malicious.

Steps to reproduce

One can use any of the OpenCV bindings to reproduce the behavior; I have tried gocv and opencv-python. Below is a simple Python program using opencv-python that reproduces the issue with the above screenshots.

import cv2
import sys

hasher = cv2.img_hash.AverageHash_create()
a1 = cv2.imread(sys.argv[1])
a2 = cv2.imread(sys.argv[2])
a1h = hasher.compute(a1)
a2h = hasher.compute(a2)
diff = hasher.compare(a1h,a2h)
print(f"image1 hash is: {a1h}")
print(f"image2 hash is: {a2h}")
print(f"image hamming distance is: {diff}")
Kumataro commented 2 years ago

Reason

It is caused by the image resizing algorithm.

https://github.com/opencv/opencv_contrib/blob/4.6.0/modules/img_hash/src/average_hash.cpp#L29

        cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_LINEAR_EXACT);

The results of resizing the input images to 8x8 with each of the INTER_* methods are shown below.

Currently img_hash uses the INTER_LINEAR_EXACT method. The resized images are not similar, so the hash values are very different.

https://docs.opencv.org/4.6.0/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121

(from left to right) NEAREST, LINEAR, CUBIC, AREA, LANCZOS4, LINEAR_EXACT, NEAREST_EXACT.


How to fix (temporary)

With this input data, the patch below seems to mitigate the problem.

        cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_AREA);
kmtr@kmtr-virtual-machine:~/work/studyC3295/A$ ./a.out A.jpg B.jpg
[[[0]
grayImg [ 43,  43,  43,  54,  54,  43,  43,  43;
  43,  43,  48,  49,  49,  47,  43,  43;
  43,  43,  52,  56,  41,  44,  43,  43;
  43,  43,  55,  69,  62,  50,  43,  43;
  43,  43,  54,  56,  51,  50,  43,  43;
  43,  43,  54,  66,  45,  45,  43,  43;
  43,  43,  47,  48,  48,  47,  43,  43;
  43,  43,  43,  44,  44,  43,  43,  43]
[[[1]
grayImg [ 43,  43,  43,  54,  54,  43,  43,  43;
  43,  43,  48,  49,  49,  47,  43,  43;
  43,  43,  52,  56,  41,  44,  43,  43;
  43,  43,  55,  69,  62,  50,  43,  43;
  43,  43,  54,  56,  51,  50,  43,  43;
  43,  43,  54,  59,  49,  45,  43,  43;
  43,  43,  47,  48,  48,  47,  43,  43;
  43,  43,  43,  44,  44,  43,  43,  43]
Hash A = [ 24,  28,  12,  60,  60,  12,  24,   0]
Hash B = [ 24,  60,  12,  60,  60,  28,  60,   0]
compare: 4
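As a sanity check, compare() for AverageHash returns the hamming distance between the two hashes, i.e. the popcount of their XOR. Recomputing it from the 8-byte values printed above reproduces the result:

```python
import numpy as np

def hamming(a, b):
    # Hamming distance between two byte hashes: popcount of their XOR.
    a = np.asarray(a, dtype=np.uint8)
    b = np.asarray(b, dtype=np.uint8)
    return int(np.unpackbits(a ^ b).sum())

# Hash values from the INTER_AREA run above:
hash_a = [24, 28, 12, 60, 60, 12, 24, 0]
hash_b = [24, 60, 12, 60, 60, 28, 60, 0]
print(hamming(hash_a, hash_b))  # 4, matching compare()
```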

However, robustness and performance are generally a trade-off.

I believe switching img_hash to AREA resizing may not be acceptable for performance reasons.

So I think it is difficult to submit this suggestion as a merge request.

Kumataro commented 2 years ago

How to fix (another way)

Another solution is to resize the image before computing the hash value.

Using the mean() function seems to work well for these images.

kmtr@kmtr-virtual-machine:~/work/studyC3295/B$ ./a.out
Hash A = [  0,   0,   0,   0,   0,  24,   0,   0]
Hash B = [255, 255, 231, 231, 231, 247, 255, 255]
compare: 57

[ INFO:0@0.025] global /home/kmtr/work/opencv/modules/core/src/parallel/registry_parallel.impl.hpp (96) ParallelBackendRegistry core(parallel): Enabled backends(3, sorted by priority): ONETBB(1000); TBB(990); OPENMP(980)
Hash A = [ 24,  28,  12,  60,  60,  12,  24,   0]
Hash B = [ 24,  28,  12,  60,  60,  28,  24,   0]
compare: 1

The sample code is below. (This test code does not provide sufficient ROI range validation, so some errors may happen.)


#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/img_hash.hpp"

#include <iostream>

using namespace cv;
using namespace cv::img_hash;
using namespace std;

Mat resize8x8(Mat &src)
{
    Mat dst(8,8,CV_8UC1,Scalar(0));

    Mat grayImg;
    cvtColor( src, grayImg, COLOR_BGR2GRAY );

    for(int y = 0 ; y < 8 ; y ++ )
    {
        for(int x = 0 ; x < 8 ; x ++ )
        {
            const Rect roi_rect = Rect
            (
                grayImg.cols * x / 8, grayImg.rows * y / 8,
                grayImg.cols / 8,     grayImg.rows / 8
            );
            const Mat roi(grayImg, roi_rect );

            dst.at<uint8_t>(y,x) = mean( roi )[0];
        }
    }
#if 0
{
static int n = 0;
imwrite(cv::format("dst_%d.png",n), dst);
n++;
}
#endif

    return dst;
}

int main(int argc, char **argv)
{
    const Ptr<ImgHashBase> func = AverageHash::create();

    Mat a = imread("A.jpg");
    Mat b = imread("B.jpg");
    Mat hashA, hashB;

    func->compute(a, hashA);
    func->compute(b, hashB);

    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;

    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;

    a = resize8x8(a);
    b = resize8x8(b);

    func->compute(a, hashA);
    func->compute(b, hashB);

    cout << "Hash A = " << hashA << endl;
    cout << "Hash B = " << hashB << endl;

    cout << "compare: " << func->compare(hashA, hashB) << endl << endl;
    return 0;
}
Avasam commented 11 months ago

Thanks @Kumataro, using the following pre-computed resize in Python gives me much saner results, and a much smaller difference from imagehash's Pillow-based implementation:

source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)

Now my different test images differ by at most 2 points, where before they would differ by up to 12, even when the image I was comparing against was sourced from a screen capture of my target comparison! (Meaning OpenCV's implementation of pHash is really sensitive to resizing, which makes sense given your explanation and the fix.)


To avoid having to install the whole package of contrib/extra modules, here's my final implementation:

from typing import cast

import cv2
import numpy as np
import numpy.typing as npt
import scipy.fftpack
from cv2.typing import MatLike

def __cv2_phash(image: MatLike, hash_size: int = 8, highfreq_factor: int = 4):
    """Implementation copied from https://github.com/JohannesBuchner/imagehash/blob/38005924fe9be17cfed145bbc6d83b09ef8be025/imagehash/__init__.py#L260 ."""  # noqa: E501
    # OpenCV has its own pHash comparison implementation in `cv2.img_hash`, but it requires contrib/extra modules
    # and is inaccurate unless we precompute the size with a specific interpolation.
    # See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    #
    # pHash = cv2.img_hash.PHash.create()
    # source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
    # capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
    # source_hash = pHash.compute(source)
    # capture_hash = pHash.compute(capture)
    # hash_diff = pHash.compare(source_hash, capture_hash)

    img_size = hash_size * highfreq_factor
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2GRAY)
    image = cv2.resize(image, (img_size, img_size), interpolation=cv2.INTER_AREA)
    dct = cast(npt.NDArray[np.float64], scipy.fftpack.dct(scipy.fftpack.dct(image, axis=0), axis=1))
    dct_low_frequency = dct[:hash_size, :hash_size]
    median = np.median(dct_low_frequency)
    return dct_low_frequency > median

source_hash = __cv2_phash(source)
capture_hash = __cv2_phash(capture)
hash_diff = np.count_nonzero(source_hash != capture_hash)