matsengrp / cft

Clonal family tree

Debug slowness on minadcl_clusters.py #219

Closed metasoarous closed 6 years ago

metasoarous commented 6 years ago

It's way slower than it should be for the largest of our unseeded trees. My guess is that ete3 just doesn't handle larger trees so well. We can try dendropy or biopython for this.

metasoarous commented 6 years ago

This script is actually taking hours to run in some cases, which is really slowing down the unseeded work, so I'm going to add this to Sprint 3 to hopefully help things flow more quickly.

matsen commented 6 years ago

Yikers. If you want to kick around pseudocode for a one-pass algorithm let me know.

metasoarous commented 6 years ago

Not necessary; dendropy to the rescue here with its PhylogeneticDistanceMatrix. It took the runtime down from a few hours to a few minutes. PR pending.