gfieldGG / videohash

Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 256-bit comparable hash value for any video.
MIT License

[General question] How do you compare hashes in the real world? #15

Open grapemix opened 3 months ago

grapemix commented 3 months ago

Thanks for sharing your hard work. I am wondering how you compare a particular hash against N existing hashes in the real world.

I am using Postgres and storing the hash as a BIGINT. But BIGINT can only hold 8 bytes, and the Postgres team has said they won't raise that limit because it comes from hardware constraints. While Postgres's SP-GiST index can only support 64 bits at most with BIGINT, I see that you upscale the hash from 64 bits to 256 bits by default. So how do you compare one 256-bit hash against N existing hashes in practice? I tried downscaling to 64 bits and ran into some problems.
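To make the size mismatch concrete, here is a minimal Python sketch of one possible workaround, not something the videohash project is known to recommend: split the 256-bit hash into four signed 64-bit chunks (each fits a BIGINT column) and compare hashes by Hamming distance with a plain linear scan. All function names here are hypothetical, and the sketch assumes the hash is available as a non-negative Python integer.

```python
from typing import Dict, List, Tuple

MASK64 = (1 << 64) - 1


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two non-negative integer hashes."""
    return bin(a ^ b).count("1")


def split_into_signed_64(h: int, bits: int = 256) -> List[int]:
    """Split a hash into bits // 64 signed 64-bit chunks, most significant first.

    Each chunk is in the signed range that a Postgres BIGINT can store.
    """
    chunks = []
    for shift in range(bits - 64, -1, -64):
        chunk = (h >> shift) & MASK64
        if chunk >= 1 << 63:  # reinterpret as two's-complement signed
            chunk -= 1 << 64
        chunks.append(chunk)
    return chunks


def join_signed_64(chunks: List[int]) -> int:
    """Inverse of split_into_signed_64: rebuild the unsigned hash."""
    h = 0
    for chunk in chunks:
        h = (h << 64) | (chunk & MASK64)
    return h


def nearest(query: int, stored: Dict[str, int], max_distance: int) -> List[Tuple[int, str]]:
    """Linear scan: (distance, key) pairs within max_distance, closest first."""
    matches = [(hamming_distance(query, h), key) for key, h in stored.items()]
    return sorted(m for m in matches if m[0] <= max_distance)
```

The linear scan is O(N) per query, so it only illustrates the comparison itself; whether an index (SP-GiST, or a vector store such as Milvus) can speed this up for 256-bit hashes is exactly the open question here.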

Are you using an SP-GiST index for storage and comparison? Or do you recommend Milvus?

Also, have you taken a look at https://towhee.io/tasks/detail/pipeline/video-copy-detection? It takes an ML approach to a similar problem. Since you are an expert on video hashing, what is your opinion of it, especially regarding accuracy? Thanks a lot.

grapemix commented 3 months ago

For the record, I cannot remove the bug label.