timhuff / canvas-phash

A Canvas-based pHash Implementation

canvas-phash

Note: This project is no longer actively maintained. The following README content needs to be updated (as it's been 8 years since this project was revised) but the general ideas hold. I've removed bluebird as a dependency, which might break backwards compatibility for some use-cases. If you're updating to v3 from v2, that should be the only breaking change.

Introduction

This is an implementation of a perceptual image hash built on Canvas and written entirely in JavaScript/CoffeeScript. The algorithm used is described in Block Mean Value Based Image Perceptual Hashing and discussed in this StackOverflow question.
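To make the block-mean idea concrete, here is a minimal sketch of the simplest variant: average each block, take the median of the block means, and emit one bit per block. It is not this package's actual code. The function name `blockMeanHash`, the `blocksPerSide` parameter, and the flat grayscale input are assumptions made for illustration, and the real implementation reads pixels from a Canvas and may differ in detail.

```javascript
// Sketch of a block-mean-value hash over a grayscale image.
// `pixels` is a row-major array of luminance values (0-255), `width` x `height`.
// `blocksPerSide` is a hypothetical parameter; width and height must be divisible by it.
function blockMeanHash(pixels, width, height, blocksPerSide) {
  const blockW = width / blocksPerSide;
  const blockH = height / blocksPerSide;
  const means = [];

  // 1. Average the luminance inside each block.
  for (let by = 0; by < blocksPerSide; by++) {
    for (let bx = 0; bx < blocksPerSide; bx++) {
      let sum = 0;
      for (let y = by * blockH; y < (by + 1) * blockH; y++) {
        for (let x = bx * blockW; x < (bx + 1) * blockW; x++) {
          sum += pixels[y * width + x];
        }
      }
      means.push(sum / (blockW * blockH));
    }
  }

  // 2. Find the median of the block means.
  const sorted = [...means].sort((a, b) => a - b);
  const mid = sorted.length / 2;
  const median = sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[Math.floor(mid)];

  // 3. One bit per block: 1 if the block mean is at or above the median, else 0.
  return means.map((m) => (m >= median ? 1 : 0));
}

// A 4x4 image split into 2x2 blocks: bright, dark, dark, bright.
const pixels = [
  200, 200,  10,  10,
  200, 200,  10,  10,
   20,  20, 220, 220,
   20,  20, 220, 220,
];
const bits = blockMeanHash(pixels, 4, 4, 2); // [1, 0, 0, 1]
```

Because each bit depends only on how a block compares with the median, small changes in brightness or compression tend to leave most bits unchanged, which is what makes the Hamming distance between two hashes a useful similarity measure.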

Difference From phash

I found the phash package to be a little error-prone with respect to file I/O. This package has a very similar API but differs in some key ways.

Performance

I ran some preliminary tests to check the performance against phash and found it's fairly comparable.

Computing A Hash

The time taken ranged from just under 75ms to 150ms. In my tests, phash generally took about 1-2 times longer than canvas-phash to compute a hash.

Finding the Hamming Distance

Typical time taken ranged from 0.2ms to 0.3ms. In my tests, canvas-phash generally took about 2-3 times longer than phash to find the Hamming distance between two hashes. When comparing against a large collection of images, this is potentially significant. That said, this library has not been optimized. Also, the hash it creates is 128 bytes long, which takes up about 2-3 times more space than a phash hash.
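For reference, the Hamming distance between two hashes is simply the number of bit positions in which they differ. The sketch below assumes each hash is a byte array (a `Uint8Array` or Node `Buffer`) of equal length. That representation and the name `hammingDistance` are assumptions for illustration, not a description of how `phash.getHammingDistance` is implemented.

```javascript
// Count the differing bits between two equal-length byte arrays.
function hammingDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error('hashes must be the same length');
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    // XOR leaves a 1 in every bit position where the two bytes differ.
    let diff = a[i] ^ b[i];
    // Clear the lowest set bit until none remain, counting each one.
    while (diff !== 0) {
      diff &= diff - 1;
      distance++;
    }
  }
  return distance;
}

const h1 = Uint8Array.of(0b11110000, 0b00000000);
const h2 = Uint8Array.of(0b00001111, 0b00000001);
const d = hammingDistance(h1, h2); // 9: eight bits differ in the first byte, one in the second
```

With a 128-byte hash the distance ranges from 0 (identical) to 1024 (every bit differs).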

API

Example Usage

(Another example exists in the repo)

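phash = require 'canvas-phash'

# bluebird is used here only for Promise.all(...).spread; it is not required by canvas-phash
Promise = require 'bluebird'
# Hash both images in parallel
Promise.all([
    phash.getImageHash 'image.jpg'
    phash.getImageHash 'otherImage.jpg'
])
# .spread is bluebird-specific; with native Promises use .then ([hash1, hash2])->
.spread (hash1, hash2)->
    # A smaller distance means the images are more similar
    dist = phash.getHammingDistance hash1, hash2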

In the previous example, Promise.all is used to make the code readable. Requiring bluebird is not necessary to use this package. The typical use case is to compute the hash of a single image via phash.getImageHash('image.jpg').then (hash)-> and compare it against a list of pre-existing hashes to find close matches.
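The single-image use case described above might look something like the following sketch. The helper `findCloseMatches`, the `{ id, hash }` record shape, and the threshold value are hypothetical and not part of this package. The distance function is passed in so the helper runs on its own, and in real use it would presumably be `phash.getHammingDistance`.

```javascript
// Return the stored entries whose hash is within `maxDistance` of `targetHash`,
// closest first. `knownHashes` is a hypothetical array of { id, hash } records.
function findCloseMatches(targetHash, knownHashes, maxDistance, distanceFn) {
  return knownHashes
    .map((entry) => ({ id: entry.id, distance: distanceFn(targetHash, entry.hash) }))
    .filter((match) => match.distance <= maxDistance)
    .sort((a, b) => a.distance - b.distance);
}

// Assumed usage with canvas-phash and native promises (no bluebird):
//
//   const phash = require('canvas-phash');
//   phash.getImageHash('image.jpg').then((hash) => {
//     // 10 is an arbitrary example threshold; tune it for your images.
//     const matches = findCloseMatches(hash, storedHashes, 10, phash.getHammingDistance);
//   });
```

A linear scan like this is fine for small collections. Given the per-comparison timings reported above, a large collection may call for a faster distance routine or an index.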