commonsmachinery / blockhash

blockhash.io
MIT License

Implementing video hashing #12

Closed jonasob closed 9 years ago

jonasob commented 9 years ago

This patch set implements video hashing, using OpenCV to extract frames from video files. Video files are not detected automatically; '-V' must be passed to blockhash to make it use the video algorithm. Hash sizes are the same for videos and images, but a video hash of length N consists of four separate hashes of length N/4, concatenated. These four hashes are taken from frames 10, F-10, F*0.35 and F*0.70, where F is the total number of frames in the video.
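As a rough illustration of the frame selection and concatenation described above, here is a minimal Python sketch. It is not the code from this patch set: the function names, the integer truncation of the fractional positions, and the `hash_frame` callable (standing in for whatever computes a quarter-length hash of one frame) are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def sample_frame_indices(total_frames: int) -> List[int]:
    """Return the four frame positions: 10, F-10, F*0.35 and F*0.70.

    Assumption: fractional positions are truncated to whole frame
    numbers; the patch may round differently.
    """
    if total_frames <= 10:
        raise ValueError("video too short to sample frame 10 and frame F-10")
    return [
        10,
        total_frames - 10,
        total_frames * 35 // 100,  # F*0.35, integer arithmetic
        total_frames * 70 // 100,  # F*0.70, integer arithmetic
    ]


def video_hash(frames: Sequence, hash_frame: Callable[[object], str]) -> str:
    """Concatenate the quarter-length hashes of the four sampled frames.

    `hash_frame` is a hypothetical helper that returns the hex string
    of an N/4-length hash for a single frame.
    """
    indices = sample_frame_indices(len(frames))
    return "".join(hash_frame(frames[i]) for i in indices)
```

With a 1000-frame video, this sketch would sample frames 10, 990, 350 and 700, in that order.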

There is currently no test set. As a quick demonstration, here are three hashes generated from the same video file with different encodings:

0e0f06dcf841e30a070f4c5cec41c31cfb009f10f250e9c0ffffe0d87e04fc40
0e0f06dcf841e30a070f4c5cec41c31cfb009f10b2d0e9c0ffffe0d87e04fc40
0e0f16d8f841e30a070f4e4cec41c31cfb009f10b2d0e9c0ffffe0d87e04fc40

The first hash is from an MPEG2 file at 25 fps with a bitrate of 1164 kb/s. The second is from the same video re-encoded to MPEG1 at 29.97 fps and 1390 kb/s. The third is from a re-encode to MPEG2 at 23.98 fps and 421 kb/s.
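One way to quantify how close the three hashes are is the Hamming distance, that is, the number of differing bits. The sketch below is only an illustration and not part of the patch set; the helper names are made up, and splitting each hash into four equal segments assumes the four per-frame hashes are simply concatenated in order.

```python
from typing import List

H1 = "0e0f06dcf841e30a070f4c5cec41c31cfb009f10f250e9c0ffffe0d87e04fc40"
H2 = "0e0f06dcf841e30a070f4c5cec41c31cfb009f10b2d0e9c0ffffe0d87e04fc40"
H3 = "0e0f16d8f841e30a070f4e4cec41c31cfb009f10b2d0e9c0ffffe0d87e04fc40"


def hamming(a: str, b: str) -> int:
    """Number of differing bits between two equal-length hex strings."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def per_segment_hamming(a: str, b: str, segments: int = 4) -> List[int]:
    """Hamming distance for each of the concatenated per-frame hashes."""
    size = len(a) // segments
    return [
        hamming(a[i * size:(i + 1) * size], b[i * size:(i + 1) * size])
        for i in range(segments)
    ]


print(hamming(H1, H2))              # 2
print(hamming(H2, H3))              # 4
print(hamming(H1, H3))              # 6
print(per_segment_hamming(H1, H3))  # [2, 2, 2, 0]
```

Out of 256 bits, the three encodings differ in at most 6, and the last sampled frame hashes identically in all three.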

This algorithm should be considered highly experimental. It is intended as a baseline against which other algorithms can be validated.