ameingast / cocoaimagehashing

Perceptual Image Hashing for macOS, iOS, tvOS and watchOS

[Question] Cross-library capable? #21

Closed danshev closed 5 years ago

danshev commented 5 years ago

Really appreciative of your efforts, thank you!

The Python ImageHash library outputs pHash values like: bfbcf072c260d0c3

How does one get a similar value from this API? Or: How should I convert that sort of pHash value to this library’s format?

If possible, I'm trying to:

  1. Use the iPhone camera to ingest image data
  2. Calculate a pHash value (using this library)
  3. Calculate the distance between the pHash value from #2 and the pHash value computed by ImageHash, provided separately.

Possible? Thanks again

danshev commented 5 years ago

Follow-up

Upon looking at the data types, I saw that OSHashType (this library's data type for the hash) is a signed 64-bit integer. So I'm converting the hexadecimal value (calculated by ImageHash) into an OSHashType directly.

Unfortunately, the resulting "distance" implies the two images (the exact same picture) are very dissimilar.

Here is my code (and outputs) -- any thoughts?

// Calculate this library's pHash for a picture of me
let danImage = UIImage(named: "dan")
let danHash = myHasher.hashImage(danImage!, with: .pHash)
print(danHash)
>> -4467566139072037888

// Convert the Python ImageHash pHash value (from the picture of me) into an OSHashType
let unsignedVal = UInt64("bfbcf072a26094c3", radix: 16)!
let pythonHash = OSHashType(Int64(bitPattern: unsignedVal))
print(pythonHash)
>> -4630561941702535997

// Calculate this library's pHash for a picture of Britney Spears
let britImage = UIImage(named: "brit")
let britHash = myHasher.hashImage(britImage!, with: .pHash)
print(britHash)
>> 7825699685688937472

print(myHasher.hashDistance(danHash, to: britHash))
>> 26

// Calculate the distance between the Python pHash value of a picture of me
//       and this library's pHash value of a picture of me
print(myHasher.hashDistance(pythonHash, to: danHash))
>> 35 ... so the same pic of me is more different than a picture of me and Britney Spears...?
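For context, pHash distances like the ones above are conventionally the Hamming distance between the two 64-bit hashes, i.e. the number of differing bits, so 35 out of 64 is roughly what two unrelated hashes produce. A minimal sketch of that distance, assuming (but not verified here) that this library's hashDistance is equivalent:

// Hamming distance between two 64-bit hashes: XOR the values, then count the set bits.
func hammingDistance(_ a: Int64, _ b: Int64) -> Int {
    return (a ^ b).nonzeroBitCount
}
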
danshev commented 5 years ago

I'll close out this issue:

There are a number of differences between this library and the Python ImageHash library which make the pHash values incompatible; examples (not exhaustive):

  1. The functions that resize the input images down to 32x32 pixels are different; as a result, the RGB tuples are demonstrably different, which then cascades into the grayscale calculation, the DCT calculation, and so on.

  2. The grayscale calculation is different (see the sketch after this list):

    • This library uses a simple (R + G + B) / 3 calculation
    • Python ImageHash, which uses PIL / Pillow's convert() function, uses the ITU-R 601-2 luma transform
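
For illustration, here is roughly what the two per-pixel grayscale conversions look like (a sketch in Swift; the 0.299/0.587/0.114 weights are the ITU-R 601-2 ones Pillow documents for convert("L"), not code taken from either library):

// Per-pixel grayscale, two ways (channel values in 0...255).

// Simple average, as described above for this library:
func grayAverage(r: Double, g: Double, b: Double) -> Double {
    return (r + g + b) / 3.0
}

// ITU-R 601-2 luma transform, as used by Pillow's convert("L"):
func grayLuma601(r: Double, g: Double, b: Double) -> Double {
    return 0.299 * r + 0.587 * g + 0.114 * b
}

// For a saturated red pixel the two already disagree noticeably:
// grayAverage(r: 255, g: 0, b: 0) == 85.0, grayLuma601(r: 255, g: 0, b: 0) == 76.245
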
ameingast commented 5 years ago

Just some feedback from my side:

Thanks for the feedback!

danshev commented 5 years ago

I was able to bring this library's resizing better in line with the ImageHash library after learning that ImageHash (along with most Python projects doing image manipulation) uses Pillow, which uses a Lanczos scale transform as the default for its resize function.

[Image: brit-32-PIL, Pillow 32x32 resize]

[Image: brit-32-pre, original CocoaImageHashing resize]

[Image: brit-32-post, converted CocoaImageHashing resize]

Note: to best appreciate the difference, download the three images and scroll through them in a file browser (the thumbnail previews will show how they change):

[Animated GIF: ezgif com-video-to-gif-2]

I replaced the RGBABitmapDataForResizedImageWithWidth() code with:

{
    UIImage *baseImage = [UIImage imageWithData:self];
    if (!baseImage) {
        return nil;
    }

    // Calculate the kCIInputScaleKey and kCIInputAspectRatioKey values needed to resize to 32x32:
    // the scale sets the output height, and the aspect ratio stretches the width to match.
    CGFloat scaleFactor = 32.0 / baseImage.size.height;
    CGFloat aspectRatio = baseImage.size.height / baseImage.size.width;

    // Resize with Core Image's Lanczos scale transform (to match Pillow's resampling).
    CIImage *input_ciimage = [[CIImage alloc] initWithImage:baseImage];
    CIImage *output_ciimage = [[CIFilter filterWithName:@"CILanczosScaleTransform"
                                          keysAndValues:kCIInputImageKey, input_ciimage,
                                                        kCIInputScaleKey, @(scaleFactor),
                                                        kCIInputAspectRatioKey, @(aspectRatio), nil] outputImage];
    CIContext *context = [CIContext contextWithOptions:nil];
    CGImageRef output_cgimage = [context createCGImage:output_ciimage fromRect:[output_ciimage extent]];
    if (!output_cgimage) {
        return nil;
    }

    // Copy out the raw RGBA pixels. CFBridgingRelease hands ownership to ARC so the
    // buffer stays valid after this method returns (a plain __bridge cast followed by
    // CFRelease would leave finalData pointing at freed memory).
    CFDataRef pixels = CGDataProviderCopyData(CGImageGetDataProvider(output_cgimage));
    NSData *finalData = (NSData *)CFBridgingRelease(pixels);

    CGImageRelease(output_cgimage);
    return finalData;
}

I changed the grayscale calculation, which was easy enough.

Unfortunately, I haven't been able to unwind how you've implemented the DCT, which is definitely producing a different result from the SciPy (default: Type II) implementation.
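
For reference, a minimal, unnormalized 1-D sketch in Swift of the type-II definition that SciPy uses by default (this is not the code of either library). ImageHash applies the DCT along both axes of the 32x32 grayscale matrix, so a mismatch in type or normalization shifts every coefficient downstream.

import Foundation

// Unnormalized 1-D DCT-II: X[k] = 2 * sum over n of x[n] * cos(pi * k * (2n + 1) / (2N))
func dctII(_ x: [Double]) -> [Double] {
    let n = x.count
    return (0..<n).map { k in
        2.0 * (0..<n).reduce(0.0) { sum, i in
            sum + x[i] * cos(Double.pi * Double(k) * (2.0 * Double(i) + 1.0) / (2.0 * Double(n)))
        }
    }
}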

Interestingly enough, however, I also found some fairly glaring mistakes in the ImageHash implementation -- most notably its use of the median (vs. the average) in the pixel comparison.
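
To illustrate that last point: both schemes set each hash bit by comparing a low-frequency DCT coefficient against a threshold, and the disagreement is only about which threshold. A hedged sketch with toy values (not either library's actual code or bit ordering):

// Build a 64-bit hash by thresholding low-frequency DCT coefficients.
func hashBits(_ coefficients: [Double], threshold: Double) -> UInt64 {
    var hash: UInt64 = 0
    for (index, value) in coefficients.prefix(64).enumerated() where value > threshold {
        hash |= 1 << UInt64(index)
    }
    return hash
}

let coefficients: [Double] = [14.2, -3.1, 7.8, 0.4, -9.6, 2.2, 5.0, -1.3]  // toy stand-ins for the 64 low-frequency coefficients
let mean = coefficients.reduce(0, +) / Double(coefficients.count)
let median = coefficients.sorted()[coefficients.count / 2]
let averageBased = hashBits(coefficients, threshold: mean)    // the average-based comparison argued for above
let medianBased = hashBits(coefficients, threshold: median)   // the median-based comparison ImageHash uses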