Closed: danshev closed this issue 5 years ago
Follow-up
Upon looking at the data types, I saw that OSHashType (this library's data type for the hash) is a signed, 64-bit integer. So I'm converting the hexadecimal value (calculated by ImageHash) into an OSHashType directly.
Unfortunately, the resulting "distance" implies the two images (the exact same picture) are very dissimilar.
Here is my code (and outputs) -- any thoughts?
// Calculate this library's pHash for a picture of me
let danImage = UIImage(named: "dan")
let danHash = myHasher.hashImage(danImage!, with: .pHash)
print(danHash)
>> -4467566139072037888
// Convert the Python ImageHash pHash value (from the picture of me) into an OSHashType
let unsignedVal = UInt64("bfbcf072a26094c3", radix: 16)!
let pythonHash = OSHashType(Int64(bitPattern: unsignedVal))
print(pythonHash)
>> -4630561941702535997
// Calculate this library's pHash for a picture of Britney Spears
let britImage = UIImage(named: "brit")
let britHash = myHasher.hashImage(britImage!, with: .pHash)
print(britHash)
>> 7825699685688937472
print(myHasher.hashDistance(danHash, to: britHash))
>> 26
// Calculate the distance between the Python pHash value of a picture of me
// and this library's pHash value of a picture of me
print(myHasher.hashDistance(pythonHash, to: danHash))
>> 35 ... so the same pic of me is more different from me than a picture of Britney Spears is...?
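As far as I can tell, the distance here is a bitwise Hamming distance, so two unrelated 64-bit hashes should land around 32 differing bits on average -- which makes both 26 and 35 "effectively unrelated". A quick Swift sketch of that interpretation (my own illustration, not this library's source):

// Hamming distance: count the bits that differ between two 64-bit hashes.
// Assumes hashDistance(_:to:) is equivalent to XOR + popcount.
func hammingDistance(_ lhs: Int64, _ rhs: Int64) -> Int {
    return (lhs ^ rhs).nonzeroBitCount
}

print(hammingDistance(pythonHash, danHash))
>> 35 (same as hashDistance(pythonHash, to: danHash) above, if the assumption holds)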
I'll close out this issue:
There are a number of differences between this library and the Python ImageHash library which make the pHash values incompatible. Examples (not exhaustive):
- The functions that resize the input images down to 32x32 pixels are different; as a result, the RGB tuples are demonstrably different, which then cascades down to the grayscale calculations, the DCT calculations, and so on.
- The grayscale calculation is different: ImageHash goes through Pillow's convert() function, which uses the ITU-R 601-2 luma transform (source).

Just some feedback from my side:
Thanks for the feedback!
I was able to bring this library's resizing better in line with the ImageHash library after learning that ImageHash (along with most Python projects doing image manipulation) uses Pillow, which uses a Lanczos scale transform as the default for its resize function.
Pillow 32x32 resize
Original CocoaImageHashing resize
Converted CocoaImageHashing resize
Note: to best appreciate the difference, download the three images and scroll through them in a file browser (the thumbnail preview will show how they change).
I replaced the RGBABitmapDataForResizedImageWithWidth() implementation with:
{
    UIImage *baseImage = [UIImage imageWithData:self];
    if (!baseImage) {
        return nil;
    }
    // Calculate the InputScaleKey and InputAspectRatioKey values needed to
    // resize to 32x32: CILanczosScaleTransform scales both dimensions by
    // inputScale and additionally multiplies the width by inputAspectRatio.
    CGFloat scaleFactor = 32.0 / baseImage.size.height;
    CGFloat aspectRatio = baseImage.size.height / baseImage.size.width;
    CIImage *input_ciimage = [[CIImage alloc] initWithImage:baseImage];
    CIImage *output_ciimage = [[CIFilter filterWithName:@"CILanczosScaleTransform"
                                          keysAndValues:kCIInputImageKey, input_ciimage,
                                                        kCIInputScaleKey, @(scaleFactor),
                                                        kCIInputAspectRatioKey, @(aspectRatio),
                                                        nil] outputImage];
    CIContext *context = [CIContext contextWithOptions:nil];
    CGImageRef output_cgimage = [context createCGImage:output_ciimage
                                              fromRect:[output_ciimage extent]];
    CFDataRef pixels = CGDataProviderCopyData(CGImageGetDataProvider(output_cgimage));
    CGImageRelease(output_cgimage);
    // Transfer ownership to ARC; a plain __bridge cast followed by
    // CFRelease(pixels) would leave finalData pointing at freed memory.
    NSData *finalData = (__bridge_transfer NSData *)pixels;
    return finalData;
}
I changed the grayscale calculation, which was easy enough.
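For anyone making the same change, Pillow documents the convert("L") transform as L = R * 299/1000 + G * 587/1000 + B * 114/1000. A minimal Swift sketch of that formula (illustration only, not the library's actual code):

// ITU-R 601-2 luma transform, as documented for Pillow's convert("L"):
//   L = R * 299/1000 + G * 587/1000 + B * 114/1000
func luma601(r: Double, g: Double, b: Double) -> Double {
    return (299.0 * r + 587.0 * g + 114.0 * b) / 1000.0
}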
Unfortunately, I haven't been able to unwind how you've implemented the DCT, which is definitely producing a different result than the SciPy (default: Type II) implementation.
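For anyone following along, SciPy's default DCT is the unnormalized Type II: X[k] = 2 * Σ_{n=0}^{N-1} x[n] · cos(π·k·(2n+1) / 2N). A naive Swift sketch of that definition (illustration only; a real port would want a fast transform):

import Foundation

// Naive 1-D DCT-II, matching SciPy's dct(x, type=2, norm=None) definition:
//   X[k] = 2 * sum_{n=0..N-1} x[n] * cos(pi * k * (2n + 1) / (2N))
func dct2(_ x: [Double]) -> [Double] {
    let n = x.count
    return (0..<n).map { k in
        2.0 * (0..<n).reduce(0.0) { sum, i in
            sum + x[i] * cos(Double.pi * Double(k) * (2.0 * Double(i) + 1.0) / (2.0 * Double(n)))
        }
    }
}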
Interestingly enough, however, I also found some fairly glaring mistakes with the ImageHash implementation -- most notably using the median (vs. the average) in the pixel comparison.
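To illustrate why that matters: the two thresholds diverge whenever the coefficient distribution is skewed. A small Swift sketch with made-up values:

// Hypothetical DCT coefficients; the outlier pulls the mean far above
// the median, so the two thresholds yield different hash bits.
let coefficients = [1.0, 2.0, 3.0, 4.0, 100.0]
let mean = coefficients.reduce(0, +) / Double(coefficients.count)     // 22.0
let median = coefficients.sorted()[coefficients.count / 2]            // 3.0
let bitsFromMean = coefficients.map { $0 > mean }     // [false, false, false, false, true]
let bitsFromMedian = coefficients.map { $0 > median } // [false, false, false, true, true]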
Really appreciative of your efforts, thank you!
The Python ImageHash library outputs pHash values like: bfbcf072c260d0c3
How does one get a similar value from this API? Or: How should I convert that sort of pHash value to this library’s format?
If possible, I'm trying to compare pHash values generated by the Python library against values generated by this one. Possible? Thanks again