commonsmachinery / blockhash

blockhash.io
MIT License

Algorithm for video hashing #13

Open jonasob opened 8 years ago

jonasob commented 8 years ago

We're looking to extend the blockhash software to cover videos in addition to the images it already processes. There are existing alternatives for this, such as the pHash DCT hashes, which were originally conceived for videos but then adapted for images. The first rough draft of what this could look like in blockhash is done in https://github.com/commonsmachinery/blockhash/pull/12

That version is dirt simple in that it only picks four key frames from the video (one at the beginning, one at the end, and two at defined points in between), does a standard 64-bit blockhash of each of those frames, and then concatenates them into a 256-bit hash. I haven't done the cross-compare yet to evaluate accuracy or false positives/negatives, but it seems to work, and any alternative to this will surely only be an improvement. So that will be the "baseline" comparison :)
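For illustration, the four-key-frame scheme described above could be sketched roughly as follows. This is not the code from the pull request: the function names are hypothetical, the two middle frames are assumed to sit at one third and two thirds of the video (the description only says "defined points"), frames are assumed to be already decoded into grayscale 2D lists whose dimensions are divisible by 8, and the per-frame hash is a simplified block hash rather than the reference blockhash implementation.

```python
from statistics import median


def block_hash_64(frame):
    """Simplified 64-bit block hash of one grayscale frame (a list of rows).

    Assumption: width and height are divisible by 8. This is a sketch,
    not the reference blockhash implementation.
    """
    height, width = len(frame), len(frame[0])
    if height % 8 or width % 8:
        raise ValueError("this sketch assumes dimensions divisible by 8")
    block_h, block_w = height // 8, width // 8

    # Sum the pixel values inside each cell of an 8x8 grid, row by row.
    sums = []
    for by in range(8):
        for bx in range(8):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                total += sum(frame[y][bx * block_w:(bx + 1) * block_w])
            sums.append(total)

    # Compare every block with the median of its horizontal band
    # (four bands of 16 blocks each); brighter than the median gives a 1 bit.
    bits = 0
    for band in range(4):
        chunk = sums[band * 16:(band + 1) * 16]
        band_median = median(chunk)
        for value in chunk:
            bits = (bits << 1) | (1 if value > band_median else 0)
    return bits


def key_frame_indices(frame_count):
    """First frame, last frame, and two frames in between (assumed 1/3 and 2/3)."""
    if frame_count < 1:
        raise ValueError("video has no frames")
    last = frame_count - 1
    return [0, last // 3, (2 * last) // 3, last]


def video_hash_256(frames):
    """Concatenate four 64-bit frame hashes into one 256-bit hex string."""
    value = 0
    for index in key_frame_indices(len(frames)):
        value = (value << 64) | block_hash_64(frames[index])
    return f"{value:064x}"


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hex hashes of equal length."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")
```

For the cross-compare, two videos would be hashed with `video_hash_256` and the results passed to `hamming_distance`; what distance counts as a match would have to be determined from the evaluation itself.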

Problems encountered so far with videos include:

That said, it's not obnoxiously slow to step through videos, but of course, it depends heavily on the length of the video itself and the processing power available. But we're talking about seconds, rather than milliseconds.

jonasob commented 8 years ago

Ah, and here's another interesting tidbit when it comes to video files:

jonasob commented 8 years ago

Current statistics here (my OpenCV version, compared with Ivan's superior ffmpeg version):

bbhopesh commented 4 years ago

@jonasob Take a look at this video hashing algorithm that Facebook recently open sourced.