bomeara / SpiculeComplexity

GNU General Public License v3.0

Use Dohrmann et al. 2017 as basis for the alignment; align our other regions to that, toss things that don't align #2

Open bomeara opened 1 year ago

bomeara commented 1 year ago

https://link-springer-com.utk.idm.oclc.org/article/10.1186/s12983-017-0191-3 (institutional proxy link; DOI 10.1186/s12983-017-0191-3)

bomeara commented 1 year ago

Also use the mtDNA and tree from https://doi.org/10.1016/j.ympev.2020.107011

bomeara commented 1 year ago

Alvizu, Adriana, et al. "Increased taxon sampling provides new insights into the phylogeny and evolution of the subclass Calcaronea (Porifera, Calcarea)." Organisms Diversity & Evolution 18.3 (2018): 279-290.

bomeara commented 1 year ago

Workflow:

SpiculeComplexity/PhylogenyFromOthers/DohrmannEtAl2017 has their output files in NEXUS format. Some may cause problems (the secondary-structure alignments are good but hard for ape to parse). Clean those up by hand on copies, load them into R with ape::read.nexus.data, and export them as NON-INTERLEAVED FASTA.

Take all the 28s, COI, etc. sequences from SpiculeComplexity/Phylogeny/seqs_raw and concatenate them by gene name (18s, 16s, etc.), so that each gene ends up in one unaligned FASTA file.
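A minimal sketch of the concatenation step, in case it helps. It assumes that each raw file carries its gene label somewhere in the file name (for example `sp1_28S.fasta`), which may not match how seqs_raw is actually organized; the function name `concat_gene` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: concatenate every raw FASTA file whose NAME contains
# a gene label (matched case-insensitively) into one unaligned FASTA file.
# Assumption: the gene label appears in the file name; adjust the pattern
# if the gene is only recorded inside the FASTA headers.
concat_gene() {
    gene=$1      # gene label to match, e.g. 28s
    src_dir=$2   # directory holding the raw sequences
    out=$3       # output file; keep it OUTSIDE src_dir

    : > "$out"   # start from an empty output file
    find "$src_dir" -type f -iname "*${gene}*" | sort | while IFS= read -r f; do
        # awk re-emits every line with a trailing newline (and drops any
        # Windows carriage return), so a file that lacks a final newline
        # cannot fuse its last sequence line with the next header.
        awk '{ sub(/\r$/, ""); print }' "$f" >> "$out"
    done
}
```

For example, `concat_gene 28s SpiculeComplexity/Phylogeny/seqs_raw unaligned.fasta` would gather the 28s files. Writing the output outside seqs_raw keeps it from being picked up again on a rerun.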

Then, for each gene, run:

mafft --inputorder --ep 0.0 --auto --nuc --add unaligned.fasta --keeplength 28s.seq > all_aligned.fasta

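where unaligned.fasta is the concatenation of the seqs_raw sequences for this gene, and 28s.seq (or the equivalent file for another gene) is the existing alignment from Dohrmann et al. Note that --keeplength holds the alignment at its original length, so any part of a new sequence that would need an insertion relative to the existing alignment is deleted. See https://mafft.cbrc.jp/alignment/software/addsequences.html for details.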
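A quick sanity check after the mafft run, sketched under the assumption that both files are plain FASTA: every sequence in all_aligned.fasta should have the same length as the sequences in the Dohrmann et al. alignment. The function names `fasta_lengths` and `check_keeplength` are hypothetical helpers, not part of mafft.

```shell
#!/bin/sh
# Print each distinct sequence length found in a FASTA file, one per line.
# Gap characters count toward the length, which is what we want for an
# alignment.
fasta_lengths() {
    awk '
        /^>/ { if (seen) print len; seen = 1; len = 0; next }
        { sub(/\r$/, ""); len += length($0) }
        END { if (seen) print len }
    ' "$1" | sort -n | uniq
}

# Succeed only if the reference alignment has a single sequence length and
# the mafft output has exactly that same single length.
check_keeplength() {
    ref_len=$(fasta_lengths "$1")   # e.g. 28s.seq
    out_len=$(fasta_lengths "$2")   # e.g. all_aligned.fasta
    case "$ref_len" in
        ''|*[!0-9]*) return 1 ;;    # empty, or more than one distinct length
    esac
    [ "$ref_len" = "$out_len" ]
}
```

Usage would be along the lines of `check_keeplength 28s.seq all_aligned.fasta && echo "alignment length preserved"`. A failure suggests that the reference file is not a clean alignment or that the run did not use --keeplength.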