Open rbouckaert opened 10 years ago
Great idea! As a general rule, such modifications to data/model should be transparent to the user and made with the user’s consent. So a dialog box listing the gene trees to be modified and the species for which dummy sequences will be added, with (Okay) and (Cancel) options, would be good.
On 15/07/2014, at 10:12 am, rbouckaert notifications@github.com wrote:
If not all gene trees have sequences for all species, for each such species a single empty sequence could be added to the alignment, and thus the gene tree, to create a valid *BEAST analysis.
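A minimal sketch of the information such a consent dialog could list, assuming each gene alignment is a plain mapping from taxon name to sequence and that a taxon-to-species mapping is available. The function names and data layout are hypothetical, chosen for illustration; this is not BEAUti code.

```python
def missing_species(alignments, taxon_to_species):
    """Return {gene: sorted list of species with no sequence in that gene}.

    alignments: {gene name: {taxon name: sequence}} (hypothetical layout)
    taxon_to_species: {taxon name: species name}
    """
    all_species = set(taxon_to_species.values())
    report = {}
    for gene, seqs in alignments.items():
        present = {taxon_to_species[taxon] for taxon in seqs}
        missing = sorted(all_species - present)
        if missing:
            report[gene] = missing
    return report


def consent_message(report):
    """Build the text a dialog with (Okay) and (Cancel) could display."""
    lines = ["Dummy sequences would be added to the following gene trees:"]
    for gene in sorted(report):
        lines.append(f"  {gene}: {', '.join(report[gene])}")
    lines.append("Proceed? (Okay) (Cancel)")
    return "\n".join(lines)


alignments = {
    "gene1": {"a1": "ACGT", "b1": "ACGA"},
    "gene2": {"a1": "AC"},
}
taxon_to_species = {"a1": "A", "b1": "B"}
report = missing_species(alignments, taxon_to_species)
message = consent_message(report)
```

Only genes that actually lack a species appear in the report, so an analysis with complete sampling would produce no dialog at all.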
Not sure whether this should be a BEAUti option or a BEAST option.
What argues for BEAUti is that it can get consent from the user, but if the taxonset is changed afterwards, the dummy sequences will still be lingering. This means we would also need a mechanism for removing dummy sequences.
On the other hand, adding sequences in BEAST means we have to reinitialise the alignment and corresponding tree as well, which requires a bit more administration.
Remco
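One way the removal mechanism mentioned above might look, as a sketch only. It assumes dummy sequences can be recognised by a naming convention (the `dummy_` prefix here is hypothetical) and that the alignment is a plain mapping from taxon name to sequence; neither is taken from BEAUti.

```python
DUMMY_PREFIX = "dummy_"  # hypothetical naming convention for dummy taxa


def prune_stale_dummies(seqs, taxon_to_species):
    """Drop dummy sequences that are no longer needed after a taxonset change.

    A dummy for species S is considered stale when S is no longer in the
    taxonset, or when S now has at least one real sequence in this alignment.
    """
    valid_species = set(taxon_to_species.values())
    real_species = {
        taxon_to_species[taxon]
        for taxon in seqs
        if not taxon.startswith(DUMMY_PREFIX) and taxon in taxon_to_species
    }
    kept = {}
    for taxon, seq in seqs.items():
        if taxon.startswith(DUMMY_PREFIX):
            species = taxon[len(DUMMY_PREFIX):]
            if species not in valid_species or species in real_species:
                continue  # stale dummy: skip it
        kept[taxon] = seq
    return kept


taxon_to_species = {"a1": "A", "b1": "B"}
before = {"a1": "ACGT", "dummy_B": "????", "dummy_C": "????"}
after = prune_stale_dummies(before, taxon_to_species)
```

Here `dummy_C` is dropped because species C is not in the taxonset, while `dummy_B` stays because B still has no real sequence in this alignment.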
I would advocate strongly for BEAUti. BEAST should never change anything about model/data.
The BEAST input XML is the definitive description of the analysis and it should be totally clear from reading the XML exactly what the analysis will be.
BEAST should do nothing but follow the instructions in that XML :)
BEAUti on the other hand is a tool to help construct sensible input XMLs.
An alternative solution to this problem would be to change the *BEAST implementation to directly handle gene trees that are missing some species. This is of course technically achievable, but when I spoke to Joseph about it he said it would involve some large changes to the implementation. It sounds like it would be a serious piece of research and would undoubtedly require substantial tests and simulations to verify.
I see where you are coming from: alignments are data, thus should be treated as sacred.
However, there is a bit of a grey area here: BEAST does initialise a number of state nodes, for example the tree when initialised as random-tree, rate indicators in relaxed clocks, etc. The StarBeastStartState even sets values for birth rate and pop sizes. This means that data as specified in the XML can be changed by BEAST.
Setting a flag in BEAUti that says "add dummy sequences, if required" seems a viable option to me, and to some extent follows what you say ("BEAST should do nothing but follow the instructions in that XML"), since the XML will tell BEAST to add dummy sequences.
What worries me is that letting BEAUti do this may not be as robust as letting BEAST sort out the dummy sequences.
Perhaps the addition of dummy sequences should only happen when saving the BEAST specification in BEAUti?
Perhaps this would be sufficient:
(1) an explicit option in the XML, and (2) an informative message in the BEAST standard output detailing exactly which dummy sequences were added.
In general, maybe we need a mechanism to allow "Are you sure?" type popups (with a specific detail message) in BEAUti that can be configured to be triggered when checkboxes are selected or deselected?
Then when, for example, the user checks the "add dummy sequences when necessary" box in BEAUti a dialog box with some details of what this entails and why it is done can be displayed.
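As a rough illustration of points (1) and (2), the sketch below gates the padding on an explicit boolean option and reports every dummy sequence it adds. The option name `add_dummy_sequences`, the `dummy_` taxon prefix, and the use of `?` as the missing-data character are assumptions made for this example, not existing BEAST or BEAUti settings.

```python
def apply_dummy_option(alignments, taxon_to_species, add_dummy_sequences, log=print):
    """Pad gene alignments with all-missing sequences, only if explicitly enabled.

    alignments: {gene name: {taxon name: sequence}} (hypothetical layout)
    add_dummy_sequences: stands in for the explicit option in the XML
    log: callable receiving one message per dummy sequence added
    """
    all_species = set(taxon_to_species.values())
    result = {}
    for gene, seqs in alignments.items():
        present = {taxon_to_species[taxon] for taxon in seqs}
        missing = sorted(all_species - present)
        if missing and not add_dummy_sequences:
            # Without the option, nothing is changed silently.
            raise ValueError(f"{gene} has no sequences for: {', '.join(missing)}")
        length = len(next(iter(seqs.values()), ""))
        padded = dict(seqs)  # leave the input untouched
        for species in missing:
            name = f"dummy_{species}"
            padded[name] = "?" * length
            log(f"Added dummy sequence {name} ({length} sites) to {gene}")
        result[gene] = padded
    return result


messages = []
padded = apply_dummy_option(
    {"gene1": {"a1": "ACGT", "b1": "ACGA"}, "gene2": {"a1": "AC"}},
    {"a1": "A", "b1": "B"},
    add_dummy_sequences=True,
    log=messages.append,
)
```

In a real analysis the new dummy taxa would also need entries in the taxon-to-species mapping; that bookkeeping is left out here to keep the example short.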
If not all gene trees have sequences for all species, for each such species a single empty sequence could be added to the alignment, and thus the gene tree, to create a valid *BEAST analysis.