Closed 473021677 closed 1 year ago
The numbers in question are estimates of the posterior mean number of events and as a result can be larger than unity.
E.g. if there is a 50% posterior probability of 0 Ds on a branch and 50% probability of 1 D one gets 0.5, but if there is 50% probability of 1 D and 50% of 2 Ds then one has 1.5.
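To make the arithmetic explicit, here is a minimal sketch of the expected-value calculation behind those two numbers. The function name `posterior_mean` and the dictionary layout are illustrative choices, not part of ALE.

```python
def posterior_mean(dist):
    """Expected number of events, given {event_count: posterior_probability}."""
    total = sum(dist.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(count * prob for count, prob in dist.items())


# 50% chance of 0 duplications, 50% chance of 1 duplication
mean_a = posterior_mean({0: 0.5, 1: 0.5})
# 50% chance of 1 duplication, 50% chance of 2 duplications
mean_b = posterior_mean({1: 0.5, 2: 0.5})

print(mean_a, mean_b)  # 0.5 1.5
```

The second value exceeds 1 because it is a mean count of events, not a probability.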
I.e. the table at the end of the .uml_rec file summarises the number of events per branch in the species tree from the reconciled gene trees above it (the long list of Newick strings). These reconciled gene trees are sampled according to their joint sequence and reconciliation likelihood (cf. eq. 3 here https://academic.oup.com/sysbio/article/62/6/901/1711882 ); in general they can have different topologies and reconciliations (i.e. series of DTL events).
In the example I attach (which I got by running ../build/bin/ALEml_undated Sab.tree Gab.tree.ale delta=0.2 tau=0.1, i.e. by forcing a higher probability of duplication for the purposes of this toy example), the list of reconciled gene trees contains several alternative reconciled gene trees, e.g.
ones with 3 duplications on branch 3 of the species tree, these are the branches on the reconciled gene tree with @.***”:
@.**@*.**@*.***:1).4:0;
and also ones with only one duplication on branch 3 of the species tree @." again) and three transfers @.>3”, @.>3” and @.>c")
@.**@*.**@*.**@.>c:1).3:0;
etc.
The estimate of the posterior mean number of D events on branch 3 of S is the average over these:
.. S_internal_branch 3 1.75 0.43 0 0.47 3.05 0.3 9.31127e-07 0.93 -10.8095 ..
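As a rough sketch of how such an average arises and how the row can be read: the per-sample duplication counts below are hypothetical, chosen only so that their mean matches the 1.75 in the row, and are not taken from the attached file. The column names are also an assumption about the table layout; only the first value after the branch ID is confirmed above as the duplication count.

```python
from statistics import mean

# Hypothetical D counts on branch 3 in eight sampled reconciled gene trees:
# three samples with 3 duplications, five samples with 1 duplication.
sampled_d_counts = [3, 3, 3, 1, 1, 1, 1, 1]
estimate = mean(sampled_d_counts)  # 14 / 8

# The summary row quoted above, split on whitespace.
row = "S_internal_branch 3 1.75 0.43 0 0.47 3.05 0.3 9.31127e-07 0.93 -10.8095"
# Assumed column names; check them against the header line of your .uml_rec file.
columns = [
    "duplications", "transfers", "losses", "originations",
    "copies", "singletons", "extinction_prob", "presence", "LL",
]
fields = row.split()
branch_id = fields[1]
values = dict(zip(columns, map(float, fields[2:])))

print(estimate, branch_id, values["duplications"])
```

Averaging integer event counts over many sampled reconciliations is what produces non-integer values, including values above 1.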
On 31 Oct 2022, at 03:51, 袁洋 @.***> wrote:
Dear Gergely, Thanks for your explanations. I still can't understand what the 1 D and 2 Ds mean. I guess that 1 D and 2 Ds mean that a gene family can be duplicated once and twice on the branch of the species phylogeny, respectively. If that is true, how should I deal with these duplication events with inferred frequencies greater than 1? And if the inferred frequency is 1.9, I am not sure whether one or two duplication events should be counted.
Best regards, Yang Yuan
------------------ Original ------------------ From: "Gergely @.>; Date: Sun, Oct 30, 2022 05:31 PM To: @.>; Cc: @.***>; Subject: Re: Inferred frequencies greater than 1 (ALEml_undated in ALE 1.0 package)
Dear Yang Yuan,
The numbers in question are estimates of the posterior mean number of events and as a result can be larger than unity.
E.g. if there is a 50% posterior probability of 0 Ds on a branch and 50% probability of 1 D one gets 0.5, but if there is 50% probability of 1 D and 50% of 2 Ds then one has 1.5.
Hope this answers the question.
Gergely
On 2022. Oct 29., at 16:55, 袁洋 @.***> wrote:
Hi, I am using ALEml_undated in the ALE 1.0 package to infer the evolutionary history. No errors have been reported by the program. When I check the result files, I find that a few inferred frequencies of duplications, transfers, losses, and originations are greater than 1. I am not sure if there's something wrong with it. I will use a threshold of 0.3 on the raw reconciliation frequencies to avoid missing true events. Could I count these duplication, transfer, loss, and origination events with inferred frequencies greater than 1? I really need your help and would really appreciate it. I have attached the result files. Thanks very much.
Best regards, YangYuan
Thanks for your explanation. I understand now.
Best regards, Yang Yuan
--- Original message --- @.> Sent: Tuesday, 1 November 2022, 6:27 PM @.>; Subject: [ssolo/ALE] Inferred frequencies greater than 1 (Issue #38)
Dear Gergely, I have another question. I have 11 gene tree files (OG0000000.muscle.trimal.phy_renamed.treefile, OG0000001.muscle.trimal.phy_renamed.treefile, ..., OG0000010.muscle.trimal.phy_renamed.treefile) and one rooted species tree (Species_rooted_tree_newick_renamed.txt), all placed in the same folder. I then ran ALE with commands such as "ALEobserve OG0000000.muscle.trimal.phy_renamed.treefile" and "ALEml_undated Species_rooted_tree_newick_renamed.txt OG0000000.muscle.trimal.phy_renamed.treefile.ale sample=100 separators="_"". For four of the gene tree files (OG0000000.muscle.trimal.phy_renamed.treefile, OG0000001.muscle.trimal.phy_renamed.treefile, OG0000002.muscle.trimal.phy_renamed.treefile and OG0000003.muscle.trimal.phy_renamed.treefile), the program reports the error "ALEml_undated using ALE v1.0 Read species tree from: Species_rooted_tree_newick_renamed.txt.. Error, file test/OG0000001.muscle.trimal.phy_renamed.treefile.ale does not seem accessible." For the other 7 gene tree files, no errors are reported. I can't resolve this problem and need your help. I have attached the files. Thanks.
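One way to narrow this down is to check which of the expected .ale files are actually present and readable in the folder from which ALEml_undated is run. The sketch below is a hypothetical helper, not part of ALE; it assumes only what the commands above show, namely that the .ale file carries the gene tree file name with ".ale" appended.

```python
import os
from pathlib import Path


def missing_ale_files(folder, suffix=".treefile"):
    """Return names of expected .ale files that are absent or unreadable."""
    folder = Path(folder)
    missing = []
    for tree in sorted(folder.glob(f"*{suffix}")):
        # Expected output name: <gene tree file name> + ".ale"
        ale = tree.with_name(tree.name + ".ale")
        if not (ale.is_file() and os.access(ale, os.R_OK)):
            missing.append(ale.name)
    return missing


if __name__ == "__main__":
    # Check the current working directory; pass another folder if needed.
    for name in missing_ale_files("."):
        print("not accessible:", name)
```

Note that the error message refers to a path beginning with test/, so it may also be worth comparing that path with the directory the command was started from.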
Best regards, Yang Yuan