faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

RAxML Output has Many Loci with Only Undetermined Values #227

Closed alexkrohn closed 3 years ago

alexkrohn commented 3 years ago

I'm running through Phyluce to eventually make a tree in RAxML. I have 68 individuals of one (putative) taxon, and one outgroup. According to the output of phyluce_align_get_only_loci_with_min_taxa, I have a final dataset of 73 alignments present in 75% of individuals (n=51). (The paltry number of UCEs is likely due to problems in the enrichment step, but, after excluding the worst individuals and loci, I'm trying to use the small amount of data that I have to get a preliminary understanding of the phylogeographic relationships.)

However, when I finally do run RAxML, I see that 1489 columns in my alignment contain only undetermined values, which are treated as missing data and which "normally...should be excluded from the analysis." Is there a way in Phyluce to trim those missing sites? Would that involve using a complete matrix instead of an incomplete matrix? Or should I look to other software to edit the phylip file?

Thanks for your help.

brantfaircloth commented 3 years ago

raxml will do this for you by making a "reduced" file from your input alignment - you don't really need to do much of anything. this is usually the first step of the raxml process (also occurs if using raxml-ng).
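
For anyone who would still rather strip those columns before handing the alignment to raxml, here is a minimal sketch in Python. It is not part of phyluce or raxml. The function name `drop_undetermined_columns` is hypothetical, and treating only `-`, `?`, and `N` as undetermined is an assumption to adjust for your data. It works on a plain dict of taxon name to aligned sequence, so reading and writing the phylip file is left out.

```python
# Characters counted as "undetermined" in this sketch (an assumption;
# extend the set if your alignment uses other missing-data symbols).
UNDETERMINED = set("-?Nn")


def drop_undetermined_columns(seqs):
    """Return a copy of seqs (name -> aligned sequence) with every column
    that contains only undetermined characters removed."""
    if not seqs:
        return {}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must all have the same length")
    # zip(*...) walks the alignment column by column.
    columns = zip(*seqs.values())
    keep = [i for i, col in enumerate(columns) if not set(col) <= UNDETERMINED]
    return {name: "".join(s[i] for i in keep) for name, s in seqs.items()}


# Toy alignment: columns 3 and 4 (1-based) hold only missing data.
example = {
    "taxon_a": "AC-?GT",
    "taxon_b": "A--NGT",
    "taxon_c": "-C-?G-",
}
trimmed = drop_undetermined_columns(example)
for name, seq in trimmed.items():
    print(name, seq)
```

Columns where at least one taxon has a real base are kept, so partially missing sites stay in the matrix; only the fully undetermined ones go away.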

alexkrohn commented 3 years ago

Thanks for the super fast response. I saw that they made a list of the reduced sites, but I wasn't sure that they actually excluded them from the analysis. Thanks again!

brantfaircloth commented 3 years ago

you bet 👍