go2starr / lshhdc

LSH based high dimensional clustering for sets and points
MIT License
79 stars 42 forks source link

Bug in unionfind.sets() #3

Closed cjauvin closed 10 years ago

cjauvin commented 10 years ago

Sorry to flood this repo, but I think I found a bug in the unionfind.sets() function. It's easy to reproduce:

uf = UnionFind()
uf.union(0, 1)
uf.union(2, 3)
uf.union(3, 0)
assert uf.sets() == [[0, 1, 2, 3]] # rather gives [[0], [1, 2, 3]]

The bug is not in the original code, it's in the added sets() function of this repo, which incorrectly assumes that the set trees are always flat. Here's my fix:

https://github.com/cjauvin/lshhdc/commit/1fd5d44acad36fe5eda46625d942badec2c452a6

@escherba, @cjdd3b and @shreyu86, if you confirm that this is a bug, you'll have to update your forked repos. Ideally I would issue a PR for this base repo, but @go2starr doesn't seem very responsive.

escherba commented 10 years ago

Confirmed. Amazing work, @cjauvin.

go2starr commented 10 years ago

Hey, thanks for the reports @cjauvin! Hadn't taken a look at this repo in quite some time, but just saw the emails. I simply wrote this code for exercise while reading the MMD book, so there are sure to be issues.

Anyway, I've pulled in your two changes.