matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

pplacer does not recognize reference sequences in alignment #140

Closed cmccoy closed 13 years ago

cmccoy commented 13 years ago

pplacer does not recognize the reference sequences in a combined alignment of reference and query sequences created with hmmalign --mapali:

# Align
hmmalign --mapali gag.sto gag.hmm gag_query.fasta > gag_query_aln.sto

This doesn't work:

pplacer -t gag.tre -s gag.stats gag_query_aln.sto
# Running pplacer v1.1.alpha09 (git v1.1.alpha08-399-gda6b4b7) analysis on gag_query_aln.sto...
# Fatal error: exception Failure("Please specify a reference alignment with -r or -c, or include all reference sequences in the primary alignment.")

Splitting the alignment up fixes the problem:

seqmagick convert --pattern-include '^(?!gag_\d)' gag_query_aln.sto gag_query_aln_ref.sto
seqmagick convert --pattern-include '^gag_\d' gag_query_aln.sto gag_query_aln_qry.sto

# OK
pplacer -t gag.tre -s gag.stats -r gag_query_aln_ref.sto gag_query_aln_qry.sto
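
For anyone checking how the two --pattern-include expressions divide the alignment, here is a minimal sketch in Python. It assumes seqmagick applies each pattern as a Python regular expression searched against the sequence name. The sequence names in the list are hypothetical; only the gag_<digit> naming of the queries comes from the commands above.

```python
import re

# Patterns copied from the two seqmagick commands above.
REF_PATTERN = re.compile(r'^(?!gag_\d)')  # names that do NOT start with gag_<digit>
QRY_PATTERN = re.compile(r'^gag_\d')      # names that start with gag_<digit>


def split_names(names):
    """Return (reference_names, query_names) according to the two patterns."""
    refs = [n for n in names if REF_PATTERN.search(n)]
    qrys = [n for n in names if QRY_PATTERN.search(n)]
    return refs, qrys


# Hypothetical sequence names, for illustration only.
names = ["HXB2_ref", "gag_1", "consensus_B", "gag_27", "gag_query"]
refs, qrys = split_names(names)

print("reference:", refs)
print("query:", qrys)
```

Because the first pattern is the negative lookahead of the second, every name falls into exactly one of the two files. Note that a name such as gag_query (no digit after the underscore) would be treated as a reference sequence.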

Files to reproduce this, along with a script that runs all of the steps above, are in $MATSENGRP/working/cmccoy/pplacer-refs-in-align

cmccoy commented 13 years ago

@habnabit: Thanks!