richelbilderbeek / pirouette_article

Article about pirouette, by Bilderbeek, Laudanno and Etienne
GNU General Public License v3.0

Process feedback reviewers #52

Closed richelbilderbeek closed 4 years ago

richelbilderbeek commented 5 years ago

Copy-paste of email below:

Fwd: Methods in Ecology and Evolution - Decision on Manuscript ID MEE-19-08-613

29 October 2019 11:15 42 KB From: Giovanni Laudanno To: Rampal S. Etienne, Richel Bilderbeek

I just received a reply from MEE.

---------- Forwarded message --------- From: Methods in Ecology and Evolution onbehalfof@manuscriptcentral.com Date: Tue 29 Oct 2019 at 11:07 Subject: Methods in Ecology and Evolution - Decision on Manuscript ID MEE-19-08-613 To: [...]

29-Oct-2019

MEE-19-08-613 Quantifying the importance of an inference model in Bayesian phylogenetics

Dear Mr Giovanni Laudanno,

I have now received the reviewers' reports and a recommendation from the Associate Editor who handled the review process. Copies of their reports are included below. Based on their evaluations, I regret to inform you that we are unable to publish your paper in Methods in Ecology and Evolution in its current form.

However, we would be willing to consider a new manuscript which takes into consideration the feedback you have received.

I think that this ms is "in between" an Application ms and a full Research ms. While the idea can be generally applied to phylogenetic inference, your pirouette R package is associated with BEAST2. I would suggest either writing it as a short Applications ms (3000 words) if you stick with the latter (and leave some programming details to Supplementary Information or vignette in R or github); OR expanding it to a full Research ms but being more general and thorough with exploring the interpretations and limitations of the approach (see reviewer 1 comments).

Please note that resubmitting your manuscript does not guarantee eventual acceptance, and that your resubmission may be subject to re-review before a decision is rendered. Please also ensure that your altered manuscript still conforms to our word limit of 6000-7000 for research articles, or 3000 for applications.

Once you have made the suggested changes, go to https://mc.manuscriptcentral.com/mee-besjournals and login to your Author Centre. Click on "Manuscripts with Decisions," and then click on "Create a Resubmission" located next to the manuscript number. Then, follow the steps for resubmitting your manuscript.

Because we are trying to facilitate timely publication of manuscripts submitted to Methods in Ecology and Evolution, your new manuscript should be uploaded within 12 weeks. The deadline for your resubmission is 27-Jan-2020. If it is not possible for you to submit your manuscript by that date, please get in touch with the editorial office, otherwise we will consider your paper as a completely new submission.

I look forward to your resubmission.

Sincerely,

Dr Lee Hsiang Liow Senior Editor, Methods in Ecology and Evolution

Reply to: Mr Chris Grieves Methods in Ecology and Evolution Editorial Office coordinator@methodsinecologyandevolution.org

Associate Editor Comments to Author: Associate Editor

Comments to the Author:

This is an interesting paper which introduces an R package designed to test whether new tree priors are “relevant” enough to justify the effort to implement such a new species tree prior. As said by one of the reviewers “Having a tool available to automate the workflow instead of having to cobble together some scripts to do this is a good idea”, but there are some aspects that need a bit more clarification and perhaps additional work to justify the publication of the method. I found the paper to be well written (but see reviewer’s comments for some interesting ideas on how to improve the paper flow), and the work potentially relevant, but as mentioned above I agree with the reviewers that there are some clarifications/adjustments that need to be made before we can properly evaluate the paper.

Apart from what was mentioned by both reviewers I would add (some purely cosmetic) the following:

Reviewer(s)' Comments to Author:

Reviewer: 1

Comments to the Corresponding Author

In this paper, Bilderbeek and co-authors present “pirouette”, a tool implemented in R for evaluating tree inference error due to tree prior misspecification. The main task of pirouette is to quantitatively evaluate the amount of tree inference error when data is generated under a “new” tree model, but analyzed under another (true-and-tested) model – the ultimate goal is to determine whether the new tree model is “worth the effort and computational burden to implement […] in a Bayesian framework”. Pirouette works by first simulating data (sequence alignments) along a tree generated under the new model, given a set of known nucleotide evolution and clock models. Then it determines the baseline error (due to the stochasticity of MCMC) through a process the authors dubbed “twinning”, to which the “true” error (under a hand-picked or best-fit tree model) is compared. Throughout the paper error is measured with the nLTT statistic, though other statistics can be used.
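
For readers unfamiliar with the nLTT statistic mentioned above, the following is a minimal Python sketch of the idea: the area between two normalized lineages-through-time curves. It is not the implementation used by pirouette or by the nLTT R package. The function names are hypothetical, and it assumes that both time and lineage counts are rescaled to the interval [0, 1] and that each tree is given as its list of branching times (including the crown age).

```python
from bisect import bisect_right


def nltt_curve(branching_times):
    """Return (times, values) of a normalized lineages-through-time step function.

    branching_times: times before present of each speciation event,
    including the crown age (the largest value). A tree with n tips
    has n - 1 such events. (Hypothetical input format for this sketch.)
    """
    crown_age = max(branching_times)
    n_tips = len(branching_times) + 1
    # Normalized time runs from 0 (crown) to 1 (present).
    times = sorted(1.0 - bt / crown_age for bt in branching_times)
    # Two lineages exist from the crown onward; each later event adds one.
    values = [(i + 2) / n_tips for i in range(len(times))]
    return times, values


def step_value(times, values, t):
    """Evaluate the right-continuous step function at normalized time t."""
    return values[bisect_right(times, t) - 1]


def nltt_statistic(branching_times_a, branching_times_b):
    """Integrate the absolute difference between two normalized LTT curves."""
    times_a, values_a = nltt_curve(branching_times_a)
    times_b, values_b = nltt_curve(branching_times_b)
    # Both curves are constant between consecutive breakpoints.
    breakpoints = sorted(set(times_a) | set(times_b) | {1.0})
    total = 0.0
    for left, right in zip(breakpoints, breakpoints[1:]):
        diff = abs(step_value(times_a, values_a, left)
                   - step_value(times_b, values_b, left))
        total += diff * (right - left)
    return total


# Two toy three-tip trees that differ only in the timing of the second split.
print(nltt_statistic([10.0, 5.0], [10.0, 2.5]))
```

Because the curves depend only on branching times, the statistic ignores taxon labels and topology, which is the property Reviewer 2 raises further below.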

The paper is overall well written, with simple worked examples that are easy to follow and in logical order. However, as is, this article reads as a very pleasant tutorial that ultimately fails to justify the usefulness of the pirouette tool. Here are some points in which this study and its discussion could be improved.

    1. As an initial remark, I imagine the models pirouette support have all been implemented and tested in other packages like geiger, ape, etc.? It might be interesting to list where the tree models are coming from, maybe merging Table 1 with section 6. If such models were implemented from scratch, there must be evidence that they're working as intended.
  1. Tree models are often used as priors in phylogenetic analyses whose main goal is to infer species tree topologies and divergence times. However, we are often interested in estimating the model parameters as they represent relevant biological quantities. For example, in the “SSE” family of models that the authors refer to in the introduction, there are specific diversification parameters that can teach us about how speciation takes place (e.g., in a trait-, or geographic-dependent manner). So regardless of the measured tree inference error when the tree prior is misspecified, it might still be worth implementing a new tree model. I would go even further and say that while it turns out that tree priors end up being commonly used in species tree inference, it is commonly the case that they are invented and implemented because researchers are interested in learning other evolutionary aspects from their favorite clades or species.

  2. Let us say we run pirouette and observe something similar to Fig. 6. There is clearly an increase in tree inference error, but how much is “enough” so I can actually determine whether or not I want to implement my new model? The authors themselves highlight an important point in the introduction “(…) when the data are very informative, as this will reduce the influence of the tree prior”. Presumably, if I have a lot of data, then I should expect a small difference between my twinning and true error to be alarming? Or did I misunderstand? And if this is correct, how much is a lot of data? How about hyper-prior misspecification? Does it matter? Should one worry about tree topology errors at all? None of these things are quantitatively (or qualitatively) addressed through the worked example and in the discussion. Bottom line is that I effectively would not know how to interpret my error distribution after using this tool, even after observing a difference between “twin” and “true”.

  3. Please correct me if I am wrong, but to run pirouette, I must have a working simulator of my new model. Is this not a quarter or a third of the way toward having an implementation of the tree model already? Arguably it is easier to implement simulators than likelihoods, but they often share the same engines (e.g., in order to simulate/stochastically map characters under an SSE model, one uses very similar ODEs – see Freyman & Höhna 2018 for an example). Does this requirement of pirouette not defeat the tool’s purpose? But maybe I misunderstand.

    1. One thing that got me confused. In the last section, listing 13 sets the “twin_model” to “birth_death”, but then in both top and bottom panels of Fig. 6 we see either “Yule” or “CEP”. In the text, it says the tree came from a Yule-like process, which is why I assume the top panel has “Yule” in the legend. Is this a typo? I also do not understand why the top panel legends read “Generative”. If the purpose of pirouette is to evaluate whether or not I want to implement a new tree model, then it doesn’t make sense to already have the inferential part implemented in R. When I read “generative” I immediately start to think that the true generative tree model was used in inference, but I think what the authors mean here is everything else in the whole model stayed the same, but Yule was used as the tree prior. Am I correct in assuming this? If so, I would re-label this graph. In fact, this section could use a bit of rewriting to clarify this point regardless of my correct understanding.
  1. My final point that would require some more work is in fact noted by the authors themselves, namely that “one tree is not enough to determine the impact of a tree prior on Bayesian inference”. Indeed, I would like to see not only the error distribution better characterized in terms of their relative magnitudes (as mentioned in my second point above), but also in terms of distributions of simulated trees. It is fine to have simple worked examples around a single tree – as they provide a clear axis along which to explain the workings of a program – but in a real-world scenario, one would probably have a distribution of trees in hand. What if the worked example in this paper revolves around a tree that is at the “tail” of the tree distribution produced by the new tree model? Perhaps if another tree was picked, then the twin and true error distributions could look more alike? Or be further apart?
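
One possible way to put a number on the "how much is enough" question raised above is sketched below in Python. It is only an illustration, not part of pirouette. The function name, the choice of summaries (shift in medians, and the fraction of "true" errors exceeding a high percentile of the "twin" baseline errors) and the toy numbers are all assumptions made here, and the sketch deliberately leaves open which threshold should count as "large".

```python
from statistics import median


def compare_error_distributions(twin_errors, true_errors, percent=95):
    """Summarize how far the 'true' errors sit above the 'twin' baseline.

    twin_errors, true_errors: inference errors (e.g. nLTT values) from
    the twin and the true pipeline. Hypothetical helper for this sketch.
    """
    twin_sorted = sorted(twin_errors)
    # Nearest-rank percentile of the baseline errors (integer ceiling).
    rank = -(-percent * len(twin_sorted) // 100)
    threshold = twin_sorted[max(rank, 1) - 1]
    n_exceeding = sum(1 for error in true_errors if error > threshold)
    return {
        "median_shift": median(true_errors) - median(twin_errors),
        "threshold": threshold,
        "fraction_exceeding": n_exceeding / len(true_errors),
    }


# Made-up error values, for illustration only.
twin = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
true = [0.05, 0.10, 0.15, 0.20]
print(compare_error_distributions(twin, true, percent=90))
```

Repeating such a summary over many simulated trees, rather than one, would also address the reviewer's final point about the distribution of trees.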

Reviewer: 2

Comments to the Corresponding Author

Pirouette is a great tool for judging whether new models are new enough to be worth the effort of implementing in phylogenetic packages. Having a tool available to automate the workflow instead of having to cobble together some scripts to do this is a good idea.

The manuscript can benefit from a bit of reorganisation, since initially it was not clear to me that this is the main function of pirouette -- I confused it with posterior predictive analysis, which seems to be closely related. Perhaps the confusion stems from the fact that section 2 and especially 2.2 dive into the practicalities before explaining the theory and motivation for someone wanting to do so. If that can be remedied it might prevent such confusion.

    1. line 20-22: "An open question is, how accurate the tree estimation is when the real macroevolutionary processes are substantially different from those assumed in the tree prior." which can be answered using tree model adequacy (TMA package for BEAST 2). Some context to clarify how this differs from posterior predictive analysis would be good here.
    1. In general, the difference between the pirouette approach and TMA/posterior predictive method should be explained more clearly, since it mostly seems to be the starting tree being from a different source, and in tree statistics used. The twin phylogeny approach is a nice addition to posterior predictive analysis.
    1. line 53-62 please break up sentence -- this one is really hard to follow.
    1. line 82ff: "Also recently, Duchene et al. [Duchene et al. 2018] released a BEAST2 package to assess how well posterior predictive simulations recover a given tree when using the standard diversification models. These studies show how current diversification models compare to one another, but they do not help to assess the importance of a new tree prior." This misrepresents the work of Duchene et al, which aims to demonstrate that a tree prior is adequate (the package is called TMA = tree model adequacy), which is pretty much the aim of this paper.
    1. Table 1 why is order of abbreviation in legend different from the order of rows in the table?
    1. Figure 1 "The twin alignment has the same number of mutations as the original alignment." and line 188ff: why keep the number of mutations constant? With the same root height and same mutation rate, there should be some natural variation in the number of mutations. Fixing these feels like this could cause unexpected biases, e.g., reduce the variance in the error measure for the twin tree analysis.
    1. line 142 "nucleotide substitution model, which we will refer to as site models".
  1. Site models include things like gamma rate heterogeneity and proportion invariable sites, so calling a substitution model a site model does not seem to be appropriate.

  2. Are gamma rate heterogeneity and proportion invariable sites supported?

    1. line 146 where does the set of inference models come from?
    1. If the inference model used in generating data differs from that inferring the tree, are you really testing the adequacy of the original model, or doing integration testing of the whole process? Using the same site and clock model used to generate the alignment seems to be the natural thing to do, since you are interested in the tree model, not the clock or site models.
    1. How to determine an appropriate length of the sequence? If the sequence is long enough, the site model for inference will be the same as that for generating the alignment. Also, for sufficiently long sequences, the tree prior won't matter (as mentioned in the paper), so I suppose the sequences should not be too informative. Some discussion around these issues would be useful.
    1. The height of the tree in units of substitutions will be a factor: when the tree is small (<< 0.1 substitutions) sequences will have many constant sites, and the tree cannot be reliably recovered. When the tree is large (>>1 substitutions), there will be saturation and it will be impossible to recover the tree. In the example, a distance of 1 is used (tree height = 10, mutation rate = 0.1), but not explained why that is a good combination. Some discussion around this would be useful.
    1. Isn't it more natural to define priors on parameters of the site model (in the spirit of https://github.com/rbouckaert/DeveloperManual/) instead of fixing them in the set I_1,...,I_N?
    1. line 174 The nLTT statistic is agnostic about taxa labels, as opposed to for example Robinson Foulds distance, and has only been demonstrated to be useful in an ABC setting. Please explain why this is such a good statistic instead of say tree length, gamma statistic, treeness, or any of the other tree metrics in the TreeStat2 package for BEAST 2.
    1. Line 176 Instead of describing the mechanism for generating a twin tree, starting with motivation for why one wants to get involved with a twin tree would be good. This now only starts at line 186 and further.
    1. line 568: capitals are missing from the references "bayesian" "nltt", etc. You might want to check other references as well.
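
The point above about tree height in units of substitutions can be made concrete with a small Python sketch. It assumes the Jukes-Cantor (JC69) substitution model, which is a choice made here for illustration and not something stated in the review. Under JC69, two sequences separated by d expected substitutions per site are expected to differ at a proportion 3/4 * (1 - exp(-4d/3)) of their sites, which saturates at 3/4. The function names are hypothetical.

```python
from math import exp


def expected_substitutions_per_site(tree_height, mutation_rate):
    """Root-to-tip distance in expected substitutions per site."""
    return tree_height * mutation_rate


def jc69_expected_site_differences(distance):
    """Expected proportion of differing sites under JC69 (assumed model)."""
    return 0.75 * (1.0 - exp(-4.0 * distance / 3.0))


# The combination quoted in the review: tree height 10, mutation rate 0.1.
root_to_tip = expected_substitutions_per_site(10.0, 0.1)
# Two tips that diverged at the root are separated by twice that distance.
for label, distance in [("root to tip", root_to_tip),
                        ("tip to tip via root", 2.0 * root_to_tip)]:
    proportion = jc69_expected_site_differences(distance)
    print(f"{label}: d = {distance:.2f}, expected differing sites = {proportion:.3f}")
```

The closer the expected proportion is to the 0.75 ceiling, the less information remains in the alignment to recover deep branching times.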

Out of curiosity:

    1. Have you identified tree models that are sufficiently different to be worth implementing in BEAST?
    1. Why was this implemented as R package instead of a BEAST package (similar to the TMA package) so it can benefit from all models available in BEAST, and not just a limited subset?

richelbilderbeek commented 5 years ago

Copy-paste of feedback Rampal:

Re: Fwd: Methods in Ecology and Evolution - Decision on Manuscript ID MEE-19-08-613

29 October 2019 11:41 49 KB From: Rampal S. Etienne To: Giovanni Laudanno, Richel Bilderbeek

[...]

I think these are useful reviews. I tend towards rewriting it as a Short Applications ms. Alternatively, we can consider submitting it to the Software section of Systematic Biology. Either way, we will have to revise the ms (you can't send it unaltered to a different journal), but the changes for Syst Biol will be fewer, I think. Then again, we have no guarantee that the ms will be accepted there, whereas we have a decent chance now in MEE, if we revise well.

Let me know what you think. In any case, I would postpone this a bit as you have a thesis to complete. But here are some thoughts I had when reading the reviews:

Regarding an example of a "known" "non-standard" tree prior, I think the diversity-dependent model is a good choice.

Regarding the error, I think we do leave it up to the individual researcher to decide whether it is large or not.

Regarding the point that having the simulation model is already halfway to an implementation in BEAST2, I would say: not in all cases, because the likelihood calculation may be very time-consuming, so an actual implementation requires a lot of effort to optimize it. Pirouette will show whether it's worth the effort.

When you do start revising, please first write the response letter, discuss it amongst ourselves, and only after we all agree implement changes in the ms (and perhaps pirouette).

richelbilderbeek commented 4 years ago

Done!