Closed leke-lyu closed 1 year ago
Hi!
Yes, the reference is excluded. If you want to include it in the analysis, you need to add it as an entry in the alignment file. Since the sequence would be identical to the reference, the corresponding entry in the MAPLE file would just be empty.
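For illustration, an "empty" entry can be added with a few lines of Python. This is only a sketch: it assumes each entry in the MAPLE file starts with a `>` header line followed by zero or more difference lines, so an entry identical to the reference is a header with nothing after it. The function name and the sample content are hypothetical.

```python
def append_empty_entry(maple_text: str, sample_name: str) -> str:
    """Return the MAPLE-format text with one extra entry that has no differences.

    Assumption: an entry is a '>' header line followed by its difference
    lines, so a sequence identical to the reference is just the header.
    """
    # Make sure the previous entry ends with a newline before adding a header.
    if maple_text and not maple_text.endswith("\n"):
        maple_text += "\n"
    return maple_text + ">" + sample_name + "\n"


# Hypothetical existing content: one sample with a single difference line.
existing = ">sampleA\nt\t241\n"
updated = append_empty_entry(existing, "Wuhan/Hu-1/2019")
print(updated)
```

The same result can be obtained by adding the header line by hand in a text editor; the point is only that no difference lines follow it.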
Also, MAPLE does not assume that the reference is the root. Instead, when using --model UNREST (which I would recommend), it infers the maximum likelihood rooting of the tree based on the non-stationary substitution model UNREST, assuming that the nucleotide frequencies in the reference genome are very close to the root nucleotide frequencies (which should be the case at the low levels of divergence for which MAPLE is relevant). With --model UNREST, the output tree is already rooted. Note, however, that there are different ways to root a tree, and there is no guarantee that the maximum likelihood rooting based on a non-stationary substitution model is the correct one. One might want to consider other information, such as outgroups and sampling times, but these cannot be included in MAPLE analyses yet.
When using a stationary substitution model (like GTR or JC), MAPLE will still output a rooted tree, but the location of the root in the tree will be meaningless.
Rooting the tree so that a specific sample is the root is not implemented in MAPLE yet, but I might add the option in the future if you are interested.
Thank you for your response! My concern regarding rooting comes from an issue with my current SARS-CoV-2 dataset. With over 26,000 Delta samples collected within a 6-month period, the dataset has a weak temporal signal. When I used TreeTime to reroot the tree, it inferred a tMRCA of 2018, which seems off. This is why I'm considering rooting my tree with the reference strain (Wuhan/Hu-1/2019). By doing so, and then fixing the topology and the clock rate at 0.0008, I hope to obtain a more accurate time-calibrated tree. At the moment, I believe the most appropriate approach is to include the reference strain (Wuhan/Hu-1/2019) in my MAPLE file and to set the --model flag to 'UNREST'. What do you think?
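As a rough sanity check on the fixed clock rate, the arithmetic can be sketched as follows. This assumes the rate of 0.0008 is in substitutions per site per year (the units are not stated above) and uses the 29,903-nucleotide length of the Wuhan-Hu-1 reference genome; the function name is hypothetical.

```python
GENOME_LENGTH = 29_903  # length of the Wuhan-Hu-1 reference genome, in nucleotides
CLOCK_RATE = 0.0008     # assumed units: substitutions per site per year


def expected_substitutions(rate: float, genome_length: int, years: float) -> float:
    """Expected number of substitutions per genome accumulated over `years`."""
    return rate * genome_length * years


per_year = expected_substitutions(CLOCK_RATE, GENOME_LENGTH, 1.0)
per_sampling_window = expected_substitutions(CLOCK_RATE, GENOME_LENGTH, 0.5)
print(f"per year: {per_year:.1f}, per 6-month window: {per_sampling_window:.1f}")
```

Under these assumptions the rate corresponds to roughly 24 substitutions per genome per year, so samples spanning only 6 months differ by about 12 expected substitutions along the time axis, which is consistent with a weak temporal signal.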
I think including the reference and using the UNREST model makes sense. MAPLE should infer the root to be on the branch between the reference and the other samples. Let me know if there are any issues!
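One way to check this after the run is to test whether the reference is a direct child of the root in the output tree. The sketch below assumes the tree is available as a plain Newick string with unquoted sample names and without bracketed comments; the function names and the toy trees are hypothetical.

```python
def top_level_children(newick: str) -> list[str]:
    """Split the root node of a Newick string into its direct child subtrees."""
    text = newick.strip().rstrip(";")
    children, current, depth = [], [], 0
    for ch in text[text.index("("):]:
        if ch == "(":
            depth += 1
            if depth == 1:
                continue  # skip the parenthesis that opens the root
        elif ch == ")":
            depth -= 1
            if depth == 0:
                children.append("".join(current))
                break  # the parenthesis that closes the root
        elif ch == "," and depth == 1:
            children.append("".join(current))
            current = []
            continue
        current.append(ch)
    return children


def reference_is_child_of_root(newick: str, reference_name: str) -> bool:
    """True if the reference leaf hangs directly off the root."""
    for child in top_level_children(newick):
        # A leaf child looks like 'name:length'; an internal child starts with '('.
        if not child.startswith("(") and child.split(":")[0] == reference_name:
            return True
    return False


rooted_on_reference = "(Wuhan/Hu-1/2019:0.0,(A:0.001,B:0.002):0.0005);"
rooted_elsewhere = "((Wuhan/Hu-1/2019:0.0001,A:0.001):0.0002,B:0.002);"
print(reference_is_child_of_root(rooted_on_reference, "Wuhan/Hu-1/2019"))
print(reference_is_child_of_root(rooted_elsewhere, "Wuhan/Hu-1/2019"))
```

If the reference is not a direct child of the root, the inferred root lies somewhere inside the sampled diversity, and the tree would need to be rerooted with another tool before time calibration.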
Hi Nicola,
I noticed that the resulting tree file excludes the reference. Is the resulting tree rooted?