Closed althonos closed 3 years ago
... I literally read #48 five minutes after opening this issue :sweat_smile:
Haha yeah, I just swapped it out for the iterative version, but the disjoint-set idea might still be better. I'd probably still welcome a PR for this, with your above code and the added dependency in setup.py.
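For context, here is a minimal sketch of the disjoint-set idea, written without any external dependency. It is not clinker's code, nor the replacement proposed in this issue: the `DisjointSet` class, the `group_genes` helper, and the input format (pairs of homologous gene identifiers) are all assumptions made for illustration. `find` is iterative, so long chains of links do not hit Python's recursion limit.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find with iterative path compression (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        # Register unseen items as their own root.
        self.parent.setdefault(item, item)
        # Walk up to the root with a loop instead of recursion.
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        # Merge the sets containing a and b, if they differ.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_genes(links):
    """Group gene identifiers connected by homology links (assumed input: pairs)."""
    ds = DisjointSet()
    for a, b in links:
        ds.union(a, b)
    groups = defaultdict(set)
    for gene in list(ds.parent):
        groups[ds.find(gene)].add(gene)
    return list(groups.values())
```

Calling `group_genes([("a", "b"), ("b", "c"), ("d", "e")])` should yield two groups, one with `a`, `b`, `c` and one with `d`, `e`. A production version would probably also add union by rank or size, which this sketch omits for brevity.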
Hi again @gamcil,

I used `clinker` to align 60 clusters, each of them sharing between 1 and 4 homologous proteins. This caused `clinker.align.consolidate` to throw a `RecursionError`, as it seemed to have some issues merging everything.

As you pointed out in the documentation, you used the Rosetta Code recursive implementation; there is an iterative implementation that would likely fix that issue. However, there is a data structure that would work better for the kind of task that `GlobalAligner.build_gene_groups` is trying to achieve: a disjoint-set.

If you are fine adding an external pure-Python dependency (`disjoint-set`), here is the replacement code: