amkozlov / raxml-ng

RAxML Next Generation: faster, easier-to-use and more flexible
GNU Affero General Public License v3.0

Duplications don't cluster after ancestral tree reconstruction #184

Closed ceren-yildirim closed 1 month ago

ceren-yildirim commented 1 month ago

Hello,

I am using RAxML-NG to generate ancestral trees for my exon sequences derived from eukaryotic genomes. My workflow is as follows:

raxml-ng --msa $MSA_FILE --data-type DNA \
    --seed 7 --threads $thread_num1 --prefix $1 \
    --model GTR{./notr_model.txt}+R4 --tree rand{20},pars{20}

raxml-ng --ancestral --msa $MSA_FILE --data-type DNA \
    --seed 7 --threads $thread_num1 --prefix $1 --model GTR{./notr_model.txt}+R4 \
    --tree ./${1}.bestTree_unrooted

I have approximately 100-200 sequences in my MSA, each around 200 nucleotides in length. However, I have encountered an issue:

Some sequences are duplicated in my dataset. In the .raxml.bestTree, these duplicates are correctly clustered together. However, in the .raxml.ancestralTree, these duplicates appear on different branches.

From the best tree: image

From the ancestral tree: image

I am unsure why this discrepancy occurs. Is this expected behavior, or could there be an issue with my model or the tree-building process? I would appreciate any insights or suggestions on how to address this problem.

Additionally, I would like to improve the accuracy of my trees, especially given the relatively short length and small number of sequences. I increased the number of starting trees and set --lh-epsilon 0.01. Could you suggest any additional parameters or strategies to further enhance the accuracy of my phylogenetic trees?

Thank you so much!

amkozlov commented 1 month ago

Hi Ceren,

IIRC --ancestral does not really modify the tree topology, so what you observe must be either a rooting/visualization artifact, or the result of the intermediate step you used to obtain ${1}.bestTree_unrooted from ${1}.bestTree.

So please double check, and if you still think that the unexpected change in topology is coming from the --ancestral command, please upload all relevant input and output files, and I will try to reproduce.
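One way to do that double check without relying on a tree viewer is to compare the two Newick files as unrooted topologies, i.e. by their sets of bipartitions (splits), which do not depend on where a tree is rooted or drawn. The following is only a minimal sketch, not part of RAxML-NG: the function names are made up for illustration, and it assumes plain unquoted Newick labels and that inner-node labels (such as those in the ancestral tree) appear directly after a closing parenthesis.

```python
import re


def leaf_sets(newick):
    """Return (set of leaf names, list of leaf sets below each inner node).

    Assumption: labels are unquoted and contain no '(', ')', ',' or ';'.
    """
    text = re.sub(r"\s+", "", newick)
    tokens = re.findall(r"[(),;]|[^(),;]+", text)
    stack = []    # one set of leaf names per currently open '('
    clades = []   # leaf set of every closed '(...)' group
    leaves = set()
    prev = None
    for tok in tokens:
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            done = stack.pop()
            clades.append(frozenset(done))
            if stack:
                stack[-1] |= done
        elif tok not in ",;" and prev != ")":
            # "name:length" of a leaf; a token right after ')' is an
            # inner-node label and/or branch length and is skipped.
            name = tok.split(":")[0]
            if name and stack:
                leaves.add(name)
                stack[-1].add(name)
        prev = tok
    return leaves, clades


def bipartitions(newick):
    """Return (leaves, set of non-trivial splits), independent of rooting."""
    leaves, clades = leaf_sets(newick)
    ref = min(leaves)  # each split is stored as the side without this leaf
    splits = set()
    for clade in clades:
        side = clade if ref not in clade else frozenset(leaves - clade)
        if 1 < len(side) < len(leaves) - 1:
            splits.add(side)
    return leaves, splits


def same_unrooted_topology(newick_a, newick_b):
    """True if both trees have the same leaves and the same splits."""
    leaves_a, splits_a = bipartitions(newick_a)
    leaves_b, splits_b = bipartitions(newick_b)
    if leaves_a != leaves_b:
        raise ValueError("trees do not contain the same set of leaves")
    return splits_a == splits_b
```

Reading each tree file into a string and passing, in turn, the original best tree, the intermediate unrooted tree, and the ancestral tree to `same_unrooted_topology` should show at which step, if any, the topology actually changes. If all three agree, the difference seen in the pictures is most likely a rooting or display effect.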

ceren-yildirim commented 1 month ago

Hi again,

Thank you very much for your time and assistance.

I realized that the intermediate step was problematic, even though it initially seemed fine. Your help made me aware of this issue.

Thanks again for your support.

amkozlov commented 1 month ago

Perfect, you're welcome!

stamatak commented 1 month ago

For visualization issues connected to rooted trees, please also have a look at:

https://pubmed.ncbi.nlm.nih.gov/28369572/

Alexis

ceren-yildirim commented 1 month ago

Thank you!