JLSteenwyk / orthosnap

a tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees
https://jlsteenwyk.com/orthosnap/
MIT License

Is it necessary to use unaligned FASTA files? #1

Closed Mia1349 closed 2 years ago

Mia1349 commented 2 years ago

Hi Jacob!

Great tool! I have been having some issues filtering the results from OrthoFinder, and your tool has been a big help! I performed several test runs, and it certainly helps me find more gene markers.

I have one question, though. The website specifies that the input file should be an unaligned orthologous group of sequences. However, after alignment and trimming (with trimAl), some of the taxa were removed from the alignment, so the number of taxa in the final tree is inconsistent with the original OG*.fa output file, and the run fails. Can I use the alignment file as input? I assume we would need to align the sequences again afterwards?

While writing this, I noticed that one of my test runs is moving really slowly. It is a large tree with almost 10,000 tips. Is that the remaining time at the end of the bar?

Many thanks!

Mia

JLSteenwyk commented 2 years ago

Hi Mia,

My apologies for responding so late. Please feel free to message me on Twitter if I don't respond sooner.

Firstly, thank you very much for your interest in OrthoSNAP. I have developed other software that may be of use to you; see https://jlsteenwyk.com/software.html.

To answer your question: no, you do not need to input unaligned sequences. An aligned file works as well, because OrthoSNAP automatically determines which sequence is longest by removing gap characters (represented as '-') before calculating sequence length.
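To illustrate the gap-removal idea, here is a minimal Python sketch. It is not OrthoSNAP's actual code: the function names (`parse_fasta`, `ungapped_length`, `longest_ungapped`) and the example sequences are hypothetical, and it only shows how sequence length might be measured after stripping '-' characters from an aligned FASTA file.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {header: sequence} (hypothetical helper)."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence ID.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def ungapped_length(seq: str, gap_char: str = "-") -> int:
    """Length of a sequence once alignment gap characters are removed."""
    return len(seq.replace(gap_char, ""))


def longest_ungapped(records: dict[str, str]) -> str:
    """Return the ID of the record with the most non-gap characters."""
    return max(records, key=lambda name: ungapped_length(records[name]))


# Made-up aligned sequences, for illustration only.
ALIGNED = """>sp1|geneA
ATG--CGT--A
>sp1|geneB
ATGCCCGTTTA
>sp2|geneC
A---------A
"""

records = parse_fasta(ALIGNED)
lengths = {name: ungapped_length(seq) for name, seq in records.items()}
print(lengths)
print(longest_ungapped(records))
```

Because gaps are ignored when lengths are compared, aligned and unaligned versions of the same sequences give the same ranking in this sketch.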

Yes, the bar reflects progress in your iteration.

All the best,

Jacob