fhcrc / deenurp

16S rRNA gene sequence curation and phylogenetic reference set creation
GNU General Public License v3.0

Fastphylo to calculate distance matrices for filtering? #6

Open nhoffman opened 10 years ago

nhoffman commented 10 years ago

Fastphylo might be a faster alternative to FastTree (it looks like it requires aligned sequences).

http://www.biomedcentral.com/1471-2105/14/334

Fastphylo: Fast tools for phylogenetics

"We present fastphylo, a software package containing implementations of efficient algorithms for two common problems in phylogenetics: estimating DNA/protein sequence distances and reconstructing a phylogeny from a distance matrix. We compare fastphylo with other neighbor joining based methods and report the results in terms of speed and memory efficiency."

cmccoy commented 10 years ago

I believe that FastTree is sub-quadratic, so fastphylo is likely to be slower if it creates and operates on a distance matrix.
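To make the quadratic cost concrete, here is a minimal sketch of how the number of pairwise distances grows with the number of sequences. It only counts the entries of a full distance matrix; it says nothing about the actual running time of fastphylo or FastTree, and the function name `pairwise_count` is made up for illustration.

```python
def pairwise_count(n):
    """Number of distinct unordered pairs among n sequences: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Each tenfold increase in sequences gives roughly a hundredfold
    # increase in distances to compute and store.
    for n in (1_000, 10_000, 100_000):
        print(n, pairwise_count(n))
```

Any method that materializes the whole matrix has to pay at least this much in time and memory, which is the concern raised above.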

matsen commented 10 years ago

Yep. It would be useful in some application where you need the distances themselves (e.g. regression) or you are given a distance matrix and want to make a tree. I don't think that's us though.

nhoffman commented 10 years ago

FWIW, we do use the distances themselves to calculate centroids and outliers. They are calculated using FastTree -nt -makematrix seqs.fasta. It isn't really clear to me from the docs or help text what exactly these distances represent (i.e., sum of branch lengths between leaves vs. pairwise distances of some sort; log-corrected distances, perhaps?).
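For illustration, here is a rough sketch of one way a centroid and outliers could be picked from such a matrix. It is not deenurp's actual code. It assumes the matrix is a PHYLIP-style square matrix (a count line, then one row per sequence: name followed by distances), treats the "centroid" as the medoid (the sequence with the smallest total distance to all others), and flags as outliers any sequences farther than a cutoff from it. The function names and the cutoff value are hypothetical.

```python
def parse_square_matrix(text):
    """Parse a PHYLIP-style square distance matrix (assumed format).

    First non-empty line: number of sequences. Each following line:
    a name, then one distance per sequence.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n = int(lines[0].split()[0])
    names, rows = [], []
    for ln in lines[1:n + 1]:
        fields = ln.split()
        names.append(fields[0])
        rows.append([float(x) for x in fields[1:n + 1]])
    return names, rows


def medoid_and_outliers(names, rows, cutoff):
    """Return the medoid name and the names farther than cutoff from it."""
    totals = [sum(row) for row in rows]
    # Medoid: index of the row with the smallest summed distance.
    m = min(range(len(names)), key=totals.__getitem__)
    outliers = [names[i] for i, d in enumerate(rows[m]) if d > cutoff]
    return names[m], outliers


# Toy matrix, made up for this example.
EXAMPLE = """3
a 0.0 0.1 0.9
b 0.1 0.0 0.8
c 0.9 0.8 0.0
"""

if __name__ == "__main__":
    names, rows = parse_square_matrix(EXAMPLE)
    print(medoid_and_outliers(names, rows, cutoff=0.5))
```

Whatever the distances turn out to represent, a procedure like this only depends on their relative sizes, so the choice of cutoff would need to be revisited if the distance definition changes.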

cmccoy commented 10 years ago

Ah, my apologies for closing then.