Open joshuaherrera opened 2 years ago
It is caused by the image resizing algorithm.
https://github.com/opencv/opencv_contrib/blob/4.6.0/modules/img_hash/src/average_hash.cpp#L29
cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_LINEAR_EXACT);
The results of resizing the input images to 8x8 with each of the INTER_* methods are shown below.
Currently img_hash uses the INTER_LINEAR_EXACT method. The resized images are not similar, so the hash values are very different.
(from left to right) NEAREST, LINEAR, CUBIC, AREA, LANCZOS4, LINEAR_EXACT, NEAREST_EXACT.
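For illustration, here is a minimal Python sketch of the same experiment (it assumes opencv-contrib-python is installed, and `sample.jpg` is a placeholder file name): it downscales one input to 8x8 with each flag and prints the resulting average hash, which makes the sensitivity to the interpolation method easy to see.

```python
import cv2

# Placeholder input file; substitute one of the screenshots from this issue.
img = cv2.imread("sample.jpg")

flags = {
    "NEAREST": cv2.INTER_NEAREST,
    "LINEAR": cv2.INTER_LINEAR,
    "CUBIC": cv2.INTER_CUBIC,
    "AREA": cv2.INTER_AREA,
    "LANCZOS4": cv2.INTER_LANCZOS4,
    "LINEAR_EXACT": cv2.INTER_LINEAR_EXACT,
    "NEAREST_EXACT": cv2.INTER_NEAREST_EXACT,
}

hasher = cv2.img_hash.AverageHash.create()
for name, flag in flags.items():
    small = cv2.resize(img, (8, 8), interpolation=flag)
    # Hashing the pre-resized 8x8 image isolates the effect of the
    # interpolation choice on the final 64-bit hash.
    print(name, hasher.compute(small).flatten())
```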
With this input data, the patch below seems to mitigate the problem.
cv::resize(input, resizeImg, cv::Size(8,8), 0, 0, INTER_AREA);
```
kmtr@kmtr-virtual-machine:~/work/studyC3295/A$ ./a.out A.jpg B.jpg
[[[0]
grayImg [ 43, 43, 43, 54, 54, 43, 43, 43;
 43, 43, 48, 49, 49, 47, 43, 43;
 43, 43, 52, 56, 41, 44, 43, 43;
 43, 43, 55, 69, 62, 50, 43, 43;
 43, 43, 54, 56, 51, 50, 43, 43;
 43, 43, 54, 66, 45, 45, 43, 43;
 43, 43, 47, 48, 48, 47, 43, 43;
 43, 43, 43, 44, 44, 43, 43, 43]
[[[1]
grayImg [ 43, 43, 43, 54, 54, 43, 43, 43;
 43, 43, 48, 49, 49, 47, 43, 43;
 43, 43, 52, 56, 41, 44, 43, 43;
 43, 43, 55, 69, 62, 50, 43, 43;
 43, 43, 54, 56, 51, 50, 43, 43;
 43, 43, 54, 59, 49, 45, 43, 43;
 43, 43, 47, 48, 48, 47, 43, 43;
 43, 43, 43, 44, 44, 43, 43, 43]
Hash A = [ 24, 28, 12, 60, 60, 12, 24, 0]
Hash B = [ 24, 60, 12, 60, 60, 28, 60, 0]
compare: 4
```
However, robustness and performance are generally in a trade-off relationship.
I believe that resizing with AREA in img_hash may not always be better, for performance reasons.
So I think it is difficult to submit this suggestion as a merge request.
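To put a rough number on that performance concern, a quick (non-rigorous) timing sketch could compare the two flags; `sample.jpg` is again a placeholder:

```python
import time

import cv2

img = cv2.imread("sample.jpg")  # placeholder test image

for name, flag in (("INTER_LINEAR_EXACT", cv2.INTER_LINEAR_EXACT),
                   ("INTER_AREA", cv2.INTER_AREA)):
    start = time.perf_counter()
    for _ in range(1000):
        cv2.resize(img, (8, 8), interpolation=flag)
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.4f} s for 1000 resizes to 8x8")
```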
Another solution is to resize the image before computing the hash value.
Using the mean() function seems to work well for these images.
```
kmtr@kmtr-virtual-machine:~/work/studyC3295/B$ ./a.out
Hash A = [ 0, 0, 0, 0, 0, 24, 0, 0]
Hash B = [255, 255, 231, 231, 231, 247, 255, 255]
compare: 57
[ INFO:0@0.025] global /home/kmtr/work/opencv/modules/core/src/parallel/registry_parallel.impl.hpp (96) ParallelBackendRegistry core(parallel): Enabled backends(3, sorted by priority): ONETBB(1000); TBB(990); OPENMP(980)
Hash A = [ 24, 28, 12, 60, 60, 12, 24, 0]
Hash B = [ 24, 28, 12, 60, 60, 28, 24, 0]
compare: 1
```
The sample code is here. (This test code does not provide sufficient ROI range validation, so some errors may occur.)
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/img_hash.hpp"
#include <iostream>
using namespace cv;
using namespace cv::img_hash;
using namespace std;
Mat resize8x8(Mat &src)
{
Mat dst(8,8,CV_8UC1,Scalar(0));
Mat grayImg;
cvtColor( src, grayImg, COLOR_BGR2GRAY );
for(int y = 0 ; y < 8 ; y ++ )
{
for(int x = 0 ; x < 8 ; x ++ )
{
const Rect roi_rect = Rect
(
grayImg.cols * x / 8, grayImg.rows * y / 8,
grayImg.cols / 8, grayImg.rows / 8
);
const Mat roi(grayImg, roi_rect );
dst.at<uint8_t>(y,x) = mean( roi )[0];
}
}
#if 0
{
static int n = 0;
imwrite(cv::format("dst_%d.png",n), dst);
n++;
}
#endif
return dst;
}
int main(int argc, char **argv)
{
const Ptr<ImgHashBase> func = AverageHash::create();
Mat a = imread("A.jpg");
Mat b = imread("B.jpg");
Mat hashA, hashB;
func->compute(a, hashA);
func->compute(b, hashB);
cout << "Hash A = " << hashA << endl;
cout << "Hash B = " << hashB << endl;
cout << "compare: " << func->compare(hashA, hashB) << endl << endl;
a = resize8x8(a);
b = resize8x8(b);
func->compute(a, hashA);
func->compute(b, hashB);
cout << "Hash A = " << hashA << endl;
cout << "Hash B = " << hashB << endl;
cout << "compare: " << func->compare(hashA, hashB) << endl << endl;
return 0;
}```
Thanks @Kumataro, using the following pre-computed resizing in Python gives me much more sane results and a much smaller difference from imagehash's implementation using Pillow:
```python
source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
```
Now my different test images differ by at most 2 points, where before they would differ by up to 12 when the image I was comparing against was sourced from a screen capture of my target comparison! (Meaning OpenCV's implementation of pHash is really sensitive to resizing, which makes sense given your explanation and the fix.)
To avoid having to install the whole package of contrib/extra modules, here's my final implementation:
```python
from typing import cast

import cv2
import numpy as np
import numpy.typing as npt
import scipy.fftpack
from cv2.typing import MatLike


def __cv2_phash(image: MatLike, hash_size: int = 8, highfreq_factor: int = 4):
    """Implementation copied from https://github.com/JohannesBuchner/imagehash/blob/38005924fe9be17cfed145bbc6d83b09ef8be025/imagehash/__init__.py#L260 ."""  # noqa: E501
    # OpenCV has its own pHash comparison implementation in `cv2.img_hash`, but it requires contrib/extra modules
    # and is inaccurate unless we precompute the size with a specific interpolation.
    # See: https://github.com/opencv/opencv_contrib/issues/3295#issuecomment-1172878684
    #
    # pHash = cv2.img_hash.PHash.create()
    # source = cv2.resize(source, (8, 8), interpolation=cv2.INTER_AREA)
    # capture = cv2.resize(capture, (8, 8), interpolation=cv2.INTER_AREA)
    # source_hash = pHash.compute(source)
    # capture_hash = pHash.compute(capture)
    # hash_diff = pHash.compare(source_hash, capture_hash)
    img_size = hash_size * highfreq_factor
    image = cv2.cvtColor(image, cv2.COLOR_BGRA2GRAY)
    image = cv2.resize(image, (img_size, img_size), interpolation=cv2.INTER_AREA)
    dct = cast(npt.NDArray[np.float64], scipy.fftpack.dct(scipy.fftpack.dct(image, axis=0), axis=1))
    dct_low_frequency = dct[:hash_size, :hash_size]
    median = np.median(dct_low_frequency)
    return dct_low_frequency > median


# `source` and `capture` are the BGRA images loaded elsewhere in my application.
source_hash = __cv2_phash(source)
capture_hash = __cv2_phash(capture)
hash_diff = np.count_nonzero(source_hash != capture_hash)
```
System information (version)
Detailed description
The AverageHash image comparison algorithm is calculating a hamming distance that is too large when comparing the following two screenshots. The hamming distance calculated is 57, although one can see that the images are practically identical, apart from some text toward the bottom. I tried other open source AverageHash algorithms (for example imghash) and received hamming distances of between 0 and 3.
NOTE: Do not navigate to okta[.]ru[.]com as the domain may be malicious.
Steps to reproduce
One can utilize any of the OpenCV client libraries to reproduce the behavior. I have tried with gocv and opencv-python. Below is a simple Python program utilizing opencv-python that can be used to reproduce the issue using the above screenshots.
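A minimal sketch of such a program (assuming opencv-contrib-python is installed for the `cv2.img_hash` module; the screenshot file names are placeholders) would be:

```python
import cv2

# Placeholder file names for the two screenshots attached above.
a = cv2.imread("screenshot_a.png")
b = cv2.imread("screenshot_b.png")

hasher = cv2.img_hash.AverageHash.create()
hash_a = hasher.compute(a)
hash_b = hasher.compute(b)

# compare() returns the Hamming distance between the two 64-bit hashes;
# for these nearly identical screenshots it reports 57.
print("Hash A =", hash_a)
print("Hash B =", hash_b)
print("compare:", hasher.compare(hash_a, hash_b))
```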