Question for Matt: Can efficiency of outputting alignments be improved?

m-orton / Evolutionary-Rates-Analysis-Pipeline

The purpose of this repository is to develop software pipelines in R that can perform large-scale phylogenetic comparisons of various taxa retrieved through the Barcode of Life Database (BOLD) API.

GNU General Public License v3.0


Question for Matt: Can efficiency of outputting alignments be improved? #17

Closed sadamowi closed 7 years ago

sadamowi commented 7 years ago

Hi Matt,

Thank you for adding the code for outputting key interim alignments and the final trimmed alignments. Those lines of code have been very helpful for exploring the performance of the analysis.

This is not an urgent issue, and I think these extra steps are fine overall for these smaller phyla (for which we potentially expect more serious alignment problems). However, I'm wondering if there is a straightforward way to have the alignments exported without having to rerun them?

If there isn't a straightforward way to combine those steps, then don't worry about this. Thank you.

Cheers, Sally

m-orton commented 7 years ago

Hi Sally, you are right. There is actually an easier way of outputting to FASTA format, which I just figured out. For example, the first preliminary alignment could be exported with:

alignmentPrelima <- DNAStringSet(alignment2[[1]])
writeXStringSet(alignmentPrelima, file="alignmentPrelima.fas", format="fasta", width=1500)

This allows us to avoid redoing alignments. I'm going to change the output step for each alignment to use this code instead.
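A minimal sketch of the same idea on toy data, in case it helps to try it outside the pipeline. The sequences, names, and output file here are hypothetical and not taken from the script. The stored alignment is coerced with `as()`, which the Biostrings documentation lists for `MultipleAlignment` objects, and the result is read back to check that nothing was realigned or altered on the way out.

```r
library(Biostrings)

# Toy stand-in for an alignment that has already been computed once
# (hypothetical sequences and names, not data from the pipeline).
toyAlignment <- DNAMultipleAlignment(c(seqA = "ACGT-ACGTA", seqB = "ACGTTACGTA"))

# Coerce the stored alignment to a DNAStringSet; gaps are kept as "-".
alignedSeqs <- as(toyAlignment, "DNAStringSet")

# Write to FASTA; width sets the maximum number of letters per line.
outFile <- file.path(tempdir(), "toyAlignment.fas")
writeXStringSet(alignedSeqs, filepath = outFile, format = "fasta", width = 1500)

# Read the file back to confirm the aligned sequences round-trip unchanged.
reloaded <- readDNAStringSet(outFile, format = "fasta")
```

Because the export works on the object already in memory, the alignment step itself runs only once.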

Best Regards, Matt

m-orton commented 7 years ago

The script in the Annelida branch has been edited to output each alignment of each class to FASTA without redoing the alignments.
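One way the per-class export could look is sketched below. This is an illustration, not the code in the Annelida branch: the list `alignmentsByClass`, the class names used as list names, the toy sequences, and the file naming scheme are all assumptions.

```r
library(Biostrings)

# Hypothetical named list holding one already-aligned DNAStringSet per class.
alignmentsByClass <- list(
  Clitellata = DNAStringSet(c(seq1 = "ACGT-ACG", seq2 = "ACGTTACG")),
  Polychaeta = DNAStringSet(c(seq3 = "TTGA--CA", seq4 = "TTGACGCA"))
)

outDir <- tempdir()

# Write each stored alignment to its own FASTA file and keep the paths.
outFiles <- vapply(names(alignmentsByClass), function(className) {
  outFile <- file.path(outDir, paste0("alignment_", className, ".fas"))
  writeXStringSet(alignmentsByClass[[className]], filepath = outFile,
                  format = "fasta", width = 1500)
  outFile
}, character(1))
```

Keeping the returned paths in a named vector makes it easy to reload any one class later with `readDNAStringSet()`.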

Best Regards, Matt

sadamowi commented 7 years ago

OK, great. I think it would be a good idea for these alignments to be saved as part of the routine running of the script. Thank you very much for addressing this.

Best wishes, Sally