thorn-oss / perception

Perceptual hashing tools for detecting child sexual abuse material
https://perception.thorn.engineering/
Apache License 2.0

Hashing triangles #6

Open DonaldTsang opened 4 years ago

DonaldTsang commented 4 years ago

This might sound off, but is there a way of hashing equilateral triangles? I raised an issue at https://github.com/pippy360/transformationInvariantImageSearch/issues/4#issuecomment-471166311 about using triangular hashes to aid affine-invariant image hashing through triangle networks (see https://pippy360.github.io/transformationInvariantImageSearch/ for examples). Maybe break the triangle down into a grid of "trixels" (triangular pixels) and apply the usual hashes that are done with normal square pixels, with some alterations? (length 8 grid = 256 trixels)

[Image: p_189_grid]
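To make the trixel grid concrete, here is a minimal sketch of one way to enumerate the trixels of a uniformly subdivided triangle using barycentric coordinates. The function names `trixel_centroids` and `to_cartesian` are hypothetical and are not part of perception; the sketch assumes each side is split into `n` equal segments, which yields `n * n` trixels.

```python
import numpy as np


def trixel_centroids(n):
    """Barycentric centroids of the n * n trixels of a triangle
    whose sides are each split into n equal segments (assumed layout)."""
    centroids = []
    # Trixels with the same orientation as the parent triangle.
    for i in range(n):
        for j in range(n - i):
            k = n - 1 - i - j
            centroids.append((i + 1 / 3, j + 1 / 3, k + 1 / 3))
    # Trixels flipped relative to the parent triangle.
    for i in range(n - 1):
        for j in range(n - 1 - i):
            k = n - 2 - i - j
            centroids.append((i + 2 / 3, j + 2 / 3, k + 2 / 3))
    return np.array(centroids) / n


def to_cartesian(barycentric, vertices):
    """Convert barycentric points to x/y given the 3 x 2 triangle vertices."""
    return np.asarray(barycentric) @ np.asarray(vertices, dtype=float)


# Hypothetical triangle in image coordinates (equilateral, side length 100).
vertices = [(0.0, 0.0), (100.0, 0.0), (50.0, 50.0 * np.sqrt(3))]
sample_points = to_cartesian(trixel_centroids(8), vertices)
```

Because the centroids are barycentric, the same sampling pattern applies to any triangle in the image, which is what makes it a candidate for affine-invariant hashing.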

faustomorales commented 4 years ago

This sounds interesting. I don't immediately know how to adapt existing hashes this way -- but that doesn't mean it's impossible, of course. That said, if you come up with a test implementation, it would be nice to see how the benchmarking results shake out. You may want to include more extreme rotations in the test transformations in order to get a more complete evaluation.

DonaldTsang commented 4 years ago

Could you explain to me how each of the current square hashes works? I know that dHash cannot be done on triangles, but what about aHash and pHash/wHash? For aHash it is just a matter of detecting trixels that are lighter than the average, and for pHash/wHash there are systems that allow cosine/wavelet transforms to be done on triangles. See:

Also, Marr-Hildreth, BlockMean, ColorMoment and PDQHash are relatively new, so there may be explanations of how they could be done for trixels.
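The aHash idea mentioned above carries over almost unchanged, since it only needs one intensity per cell. A minimal sketch, assuming the trixel intensities have already been sampled into a flat array (the names `trixel_average_hash` and `hamming_distance` are hypothetical and not perception's API):

```python
import numpy as np


def trixel_average_hash(values):
    """One bit per trixel: True where the trixel is lighter than the mean."""
    values = np.asarray(values, dtype=float)
    return values > values.mean()


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two equal-length boolean hashes."""
    return int(np.count_nonzero(np.asarray(hash_a) != np.asarray(hash_b)))


# Four trixel intensities (a triangle with each side split in two).
example_hash = trixel_average_hash([0, 10, 0, 10])
```

The bit order depends on how the trixels are enumerated, so two hashes are only comparable if both triangles use the same enumeration and the same starting vertex.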

DonaldTsang commented 4 years ago

According to http://www.gingerling.co.uk/image-and-attribution-identification-game-using-blockhash-and-elog-io/ Block Mean hashes like blockhash.io actually use the median, much like how aHash operates but with different parameters. In that case both aHash and Block Mean hashes can be done in the same way. Marr–Hildreth is all about detecting edges in the square (or in this case a triangle), see https://en.wikipedia.org/wiki/Marr%E2%80%93Hildreth_algorithm and I think it is possible to modify it to detect edges in triangles through 4-trixel or even 16-trixel kernels. PDQ hashing is basically a convolution that requires a two-dimensional DCT, according to https://github.com/facebook/ThreatExchange/blob/master/hashing/hashing.pdf Color Moment is still unknown.
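One possible reading of a "4-trixel kernel" is a trixel together with the (up to) three trixels that share an edge with it. Under that assumption, a rough sketch of a Laplacian-style operator on the trixel grid could look like the following. It covers only the Laplacian step, not the Gaussian smoothing or zero-crossing detection that Marr–Hildreth also involves, and the names `trixel_neighbors` and `trixel_laplacian` are hypothetical.

```python
def trixel_neighbors(n):
    """Map each trixel to the trixels that share an edge with it.

    Keys are ("up", i, j) with i + j <= n - 1 for trixels oriented like the
    parent triangle, and ("down", i, j) with i + j <= n - 2 for flipped ones.
    """
    neighbors = {("up", i, j): [] for i in range(n) for j in range(n - i)}
    for i in range(n - 1):
        for j in range(n - 1 - i):
            down = ("down", i, j)
            # A flipped trixel is always interior and touches three "up" trixels.
            touching = [("up", i + 1, j), ("up", i, j + 1), ("up", i, j)]
            neighbors[down] = touching
            for up in touching:
                neighbors[up].append(down)
    return neighbors


def trixel_laplacian(values, neighbors):
    """Sum of neighbouring intensities minus degree times own intensity."""
    return {
        key: sum(values[nb] for nb in nbs) - len(nbs) * values[key]
        for key, nbs in neighbors.items()
    }


# Smallest non-trivial grid: one flipped trixel surrounded by three others.
grid = trixel_neighbors(2)
intensities = {key: 0.0 for key in grid}
intensities[("down", 0, 0)] = 1.0
response = trixel_laplacian(intensities, grid)
```

Trixels on the boundary have fewer than three neighbours, so the operator uses each trixel's actual degree instead of a fixed weight.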