bikashg closed 6 years ago
Yep. It's just the integer representation of the fingerprint:
>>> bin(8550830854347186281)
'0b111011010101010101001110101101110010111110100000110000001101001'
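To make the link between the integer and its bits concrete, here is a minimal sketch using only Python built-ins. It assumes nothing about the library beyond the fingerprint value quoted above; the variable names are illustrative. Note that `bin()` drops leading zeros, so this value prints as 63 digits, and padding to 64 shows the full width.

```python
# Fingerprint value quoted in the reply above.
fp = 8550830854347186281

# bin() omits leading zeros, so this value shows 63 binary digits, not 64.
short_bits = bin(fp)[2:]

# Zero-pad to the full 64-bit width to see every bit of the fingerprint.
full_bits = format(fp, "064b")

# Parsing the bit string as base 2 gives back the same integer.
round_trip = int(full_bits, 2)

print(len(short_bits), len(full_bits), round_trip == fp)  # 63 64 True
```

The integer and the bit string are two spellings of the same value, so nothing is lost by printing the fingerprint in base 10.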
Thanks for the reply. So the program internally uses the binary form (for matching) but displays the integer when printing? Also, please help me understand the relationship between the 64-bit binary value and the 19-digit integer.
Internally, the fingerprints are stored as a uint64_t, an unsigned 64-bit integer. These integers are compared to one another when identifying near-duplicates, by counting the number of bits in which they differ. The ~19-digit integer is just the base-10 representation of the fingerprint.
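The comparison described here can be sketched in plain Python. This is only an illustration, not the library's actual implementation: `hamming_distance` is a hypothetical helper, and `near` is a made-up second fingerprint. The last lines also show where the digit count comes from: the largest unsigned 64-bit value has 20 decimal digits, so fingerprints typically print as 19 or 20 digits.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two 64-bit fingerprints differ."""
    mask = (1 << 64) - 1  # keep only the low 64 bits
    # XOR leaves a 1 exactly where the two values disagree.
    return bin((a ^ b) & mask).count("1")


fp = 8550830854347186281
near = fp ^ 0b101  # hypothetical fingerprint: bits 0 and 2 flipped

print(hamming_distance(fp, near))  # 2

# Largest value a uint64_t can hold, and its length in base 10.
max_uint64 = 2**64 - 1
print(max_uint64, len(str(max_uint64)))  # 18446744073709551615 20
```

Two documents would be treated as near-duplicates when this distance falls at or below some chosen threshold; the threshold itself is a tuning choice not stated in this thread.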
I printed the output of the simhash.compute() method, both its type and its value. I noticed that the type is integer and the value is a 19-digit number (e.g. 8550830854347186281). Shouldn't it be a 64-digit fingerprint consisting of only 0s and 1s?