seomoz / simhash-py

Simhash and near-duplicate detection
MIT License

Fixed the hashes() function so it actually returns stored hashes, as … #15

Closed eyvindn closed 9 years ago

eyvindn commented 9 years ago

Table 0 isn't guaranteed to be in the original permutation: we only permute the first x elements, and the remaining ones are just appended and can be in any order. As a result, hashes() currently returns the permuted versions of the hashes, which isn't very useful. With this change, hashes() returns the hashes as they were stored. It also lets test.py pass, provided you have the fixed iterator in simhash-cpp.
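To illustrate why reading hashes straight out of a permuted table gives back different values, here is a minimal sketch in plain Python. It is not the simhash-py or simhash-cpp implementation: the helper names, the four 16-bit blocks, and the block order used for "table 0" are all assumptions made for this example.

```python
# Hypothetical illustration only; none of these names come from simhash-py
# or simhash-cpp, and the block layout below is an assumption.
BITS = 64


def split_blocks(value, widths):
    """Split a BITS-wide integer into blocks, most significant block first."""
    blocks = []
    shift = BITS
    for width in widths:
        shift -= width
        blocks.append((value >> shift) & ((1 << width) - 1))
    return blocks


def join_blocks(blocks, widths):
    """Concatenate blocks (most significant first) back into one integer."""
    out = 0
    for block, width in zip(blocks, widths):
        out = (out << width) | block
    return out


def permute(value, widths, order):
    """Reorder blocks so that original block order[i] lands in position i."""
    blocks = split_blocks(value, widths)
    return join_blocks([blocks[i] for i in order], [widths[i] for i in order])


def unpermute(permuted, widths, order):
    """Invert permute(): recover the hash as it was originally inserted."""
    permuted_widths = [widths[i] for i in order]
    permuted_blocks = split_blocks(permuted, permuted_widths)
    blocks = [0] * len(widths)
    for position, original_index in enumerate(order):
        blocks[original_index] = permuted_blocks[position]
    return join_blocks(blocks, widths)


WIDTHS = [16, 16, 16, 16]   # assumed: four equal blocks
ORDER = [1, 0, 3, 2]        # assumed block order for "table 0", not the identity

original = 0x0123456789ABCDEF
in_table = permute(original, WIDTHS, ORDER)

print(hex(in_table))                            # 0x45670123cdef89ab
print(hex(unpermute(in_table, WIDTHS, ORDER)))  # 0x123456789abcdef
```

Under these assumptions, the value held in the table differs from the inserted hash whenever the block order is not the identity, so a function that reports stored hashes has to undo the permutation (or read from an unpermuted copy) before returning them.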