ropensci / software-review

rOpenSci Software Peer Review.
292 stars 104 forks source link

Submission: mcbette #360

Closed richelbilderbeek closed 4 years ago

richelbilderbeek commented 4 years ago

Submitting Author: Richel Bilderbeek (@richelbilderbeek)
Repository: https://github.com/richelbilderbeek/mcbette Version submitted: 1.8.2 Editor: @maelle
Reviewer 1: @vbaliga Reviewer 2: @bjoelle Archive: TBD
Version accepted: TBD


Package: mcbette
Title: Model Comparison Using 'babette'
Version: 1.8.2
Authors@R: c(
    person("Richel J.C.", "Bilderbeek", email = "richel@richelbilderbeek.nl", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-1107-7049")))
Maintainer: Richel J.C. Bilderbeek <richel@richelbilderbeek.nl>
Description: 'BEAST2' (<http://www.beast2.org>) is a widely used
    Bayesian phylogenetic tool, that uses DNA/RNA/protein data
    and many model priors to create a posterior of jointly estimated 
    phylogenies and parameters.
    'mcbette' allows to do a Bayesian model comparison over some
    site and clock models, 
    using 'babette' (<https://www.github.com/ropensci/babette>).
License: GPL-3
LazyData: true
RoxygenNote: 7.0.2
VignetteBuilder: knitr
URL: https://github.com/richelbilderbeek/mcbette
BugReports: https://github.com/richelbilderbeek/mcbette/issues
Depends:
    babette,
    beautier (>= 2.3),
    beastier (>= 2.0.25),
    mauricer,
    tracerer
Imports:
    knitr,
    Rmpfr,
    testit
Suggests:
    ape,
    devtools,
    ggplot2,
    hunspell,
    lintr,
    nLTT,
    phangorn,
    rappdirs,
    rmarkdown,
    spelling,
    testthat (>= 2.1.0)
Language: en-US
Encoding: UTF-8
SystemRequirements: BEAST2 (http://www.beast2.org/)

Scope

mcbette automates a model comparison using the tool 'BEAST2' with one of its packages.

:warning: because mcbette is a wrapper for 'BEAST2' and because 'BEAST2' does not allow to do what mcbette does on Windows, this package does not work under the Windows operating system.

This package aims at scientists in the field of phylogenetics, as it allows to do a Bayesian model comparison of phylogenetic inference models. Or simpler: with mcbette you can pick the best model to build a phylogeny from a DNA sequence.

There is some overlap, as model comparison is an important topic in phylogenetics. Unique about mcbette is the way how it does so: it uses a novel 'BEAST2' package (called 'NS', shorthand of 'Nested Sampling') to directly estimate the marginal likelihood (a.k.a. the evidence) of an inference model. Because the technique of Nested Sampling is -in this field- novel, the 'BEAST2' package is novel and mcbette is novel.

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options - [x] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [x] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. I do intend to submit to JOSS, as it looks awesome! Because I did not expect this (for me, new) option, I am not prepared yet. I will await the acceptance of my submission first, than happily write this! - [ ] The package is deposited in a long-term repository with the DOI: Not yet. - (*Do not submit your package separately to JOSS*)
MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

maelle commented 4 years ago

Editor checks:


Editor comments

Thanks @richelbilderbeek for your submission! I'm now looking for reviewers.

In the meantime, I have a few comments/questions you can address:

goodpractice output

  ✖ not use "Depends" in DESCRIPTION, as it can cause name clashes, and
    poor interaction with other packages. Use "Imports" instead.
  ✖ not import packages as a whole, as this can cause name clashes between
    the imported packages. Instead, import only the specific functions you need.

What are the reasons for importing the whole packages?

  ✖ fix this R CMD check WARNING: LaTeX errors when creating PDF version.
    This typically indicates Rd problems. LaTeX errors found: ! LaTeX Error: File
    `ltxcmds.sty' not found. Type X to quit or <RETURN> to proceed, or enter new
    name. (Default extension: sty) ! Emergency stop. <read *> l.105
    \RequirePackage{ltxcmds}[2010/11/12] ^^M !  ==> Fatal error occurred, no output
    PDF file produced!

Maybe/probably just a setup thing?


Reviewers: @vbaliga @bjoelle Due date: 2020-03-30

maelle commented 4 years ago

You can now add a review badge rodev::use_review_badge(360).

maelle commented 4 years ago

I forgot a question: why are there expectations in examples? Can't these cases & expectations live in tests?

maelle commented 4 years ago

Reviewer 1: @vbaliga (thanks for agreeing to review!). I'll add a deadline once the second reviewer is assigned.

richelbilderbeek commented 4 years ago

Thanks @maelle for the first wave of feedback :+1:

The link to BEAST2 in DESCRIPTION uses http, please replace it with https https://www.beast2.org/ (I made a PR). If you used the http link in other packages, you can change it but that is obviously beyond the scope of my checks. wink

Have you thought about providing guidance/examples on other CI services e.g. GH actions, Circle CI?

I have thought about about it, but I felt it not worth the effort to put another CI in place. Usually I add an AppVeyor build to check on Windows, but because mcbette cannot work on Windows (due to BEAST2), I skipped the AppVeyor build (although I could add a trivial build).

You think I should?

Thanks!

In the FAQ, it says the other packages are not on CRAN, but they're on CRAN now, aren't they?

They are, I will update the FAQ

Why is there no README.Rmd?

Because my other R packages on rOpenSci do not have it either. I'll look into this.

Are your R packages listed somewhere on BEAST2 website?

No. BEAST2 has its own (albeit BEAST2) packages. I think they are aware of babette. You think I should contact them?

I note the JOSS paper.md isn't there yet.

Correct, I made an Issue here. This JOSS item on rOpenSci is new to me, and I think it's cool. Because I am in the lost months of my PhD, though, I do need to be careful with my time. I suggest the deadline of March 2nd February 9th. Would that work?

goodpractice output

✖ not use "Depends" in DESCRIPTION, as it can cause name clashes, and poor interaction with other packages. Use "Imports" instead. ✖ not import packages as a whole, as this can cause name clashes between the imported packages. Instead, import only the specific functions you need.

What are the reasons for importing the whole packages?

There are -really!- dozens of functions that need to be imported. OTOH, goodpractice is right! I fixed it

✖ fix this R CMD check WARNING: LaTeX errors when creating PDF version. This typically indicates Rd problems. LaTeX errors found: ! LaTeX Error: File `ltxcmds.sty' not found. Type X to quit or to proceed, or enter new name. (Default extension: sty) ! Emergency stop. <read *> l.105 \RequirePackage{ltxcmds}[2010/11/12] ^^M ! ==> Fatal error occurred, no output PDF file produced!

Maybe/probably just a setup thing?

Thanks, I have fiixed it

I forgot a question: why are there expectations in examples? Can't these cases & expectations live in tests?

After asking the CRAN maintainers for help, indeed, one should not put tests in examples. I've removed these.

maelle commented 4 years ago

Thank you!

No need to add a trivial Appveyor build. :-)

Yes I think it might make sense to contact BEAST2 developers because their website is probably the first place an user would look for an R package? (second being a search engine, which would probably uncover your packages).

Yes let's put this on hold until 2020/02/09 (@vbaliga, thanks for your understanding, sorry for contacting you too soon). Good luck with your PhD tasks!

richelbilderbeek commented 4 years ago

Thanks @maelle :+1:

maelle commented 4 years ago

Second reviewer @bjoelle (thank you!) - after 2020/02/09

richelbilderbeek commented 4 years ago

@maelle: the JOSS paper is ready for review, but I still need some time (say, another weekend) to fix the other points you've mentioned.

maelle commented 4 years ago

Great, thanks for the update!

richelbilderbeek commented 4 years ago

@maelle: I've worked on some of the points you've mentioned, but have not been able to complete them all yet. Will probably fix before Sunday February 16th 23:59, or at least give another update :+1;

richelbilderbeek commented 4 years ago

Just a quick update: done everything, except that I am struggling with some goodpractice issues.

Will post when having solved those last things. Then -AFAICS- the package is ready for review :+1:

maelle commented 4 years ago

thanks for the update!

richelbilderbeek commented 4 years ago

Just a quick update: nearly there, just waiting for the Travis CI and AppVeyor builds to finish, but this will take until tomorrow (I've queued quite some). I will update when I did the last tiny tidbits :+1:

richelbilderbeek commented 4 years ago

Still to do:

richelbilderbeek commented 4 years ago

As far as I can see, the review is ready to roll :shipit: !

Regarding the goodpractice Issue, I've posted a bug report at the goodpractice repo. I hope this will allow this review to get started :rocket:

maelle commented 4 years ago

Thanks @richelbilderbeek!

@vbaliga @bjoelle I've set the review deadline to 2020-03-30. As a reminder, here's the reviewer guide.

noamross commented 4 years ago

⚠️⚠️⚠️⚠️⚠️

In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci is temporarily pausing new submissions for software peer review for 30 days (and possibly longer). Please check back here again after 17 April for updates.

In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent.

Other rOpenSci community activities continue. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other.

The rOpenSci Editorial Board

⚠️⚠️⚠️⚠️⚠️

bjoelle commented 4 years ago

Installing mcbette isn't working for me due to issues with rJava (macOS 10.14.6). The CRAN version of rJava is looking for a specific JDK which isn't the one on my machine. Compiling the package from source (following instructions here zhiyzuo.github.io/installation-rJava/) appeared to work, but running any function crashes the R console.

richelbilderbeek commented 4 years ago

I have observed that getting rJava to work is the hardest nut to crack. mcbette does build on MacOS on Travis CI, but well, that ain't your local computer :+1:

bjoelle commented 4 years ago

@richelbilderbeek I got it working on the cluster on Linux, so I will be able to review mcbette after all

bjoelle commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [x] A short summary describing the high-level functionality of the software
  • [x] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 3h for installing + 4h for reviewing


Review Comments

This package is well written overall, and makes good use of one of the strengths of babette, which is to generate quickly multiple variations of an analysis. I believe the package installation can be made easier, as detailed below. Otherwise, most of my comments concern the documentation, which I think is too sparse at the moment.

Installation

I had trouble with the package installation due to issues with rJava, and never got it working on my Mac (macOS 10.14.6). The CRAN version of rJava is looking for a specific JDK which isn't the one on my machine. Compiling it from source appeared to work, but running any function crashes the R console. Installing on a Linux cluster worked.

It looks to me as if the dependency on rJava is only used in the get_default_java_path function, which seems disproportionate with the amount of trouble it is. I would recommend exploring whether the dependency can be removed, and if not providing a work-around for people who cannot get rJava working - one option would be to ask users to provide the path to Java manually.

Other comments

Documentation

Minor comments

Functionality

JOSS paper comments

Minor comments

annakrystalli commented 4 years ago

⚠️⚠️⚠️⚠️⚠️ In the interest of reducing load on reviewers and editors as we manage the
COVID-19 crisis, rOpenSci new submissions for software peer review are paused.

In this period new submissions will not be handled, nor new reviewers assigned.
Reviews and responses to reviews will be handled on a 'best effort' basis, but
no follow-up reminders will be sent. Other rOpenSci community activities continue.

Please check back here again after 25 May when we will be announcing plans to slowly start back up.

We express our continued great
appreciation for the work of our authors and reviewers. Stay healthy and take
care of one other.

The rOpenSci Editorial Board ⚠️⚠️⚠️⚠️⚠️

vbaliga commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software

Some of this may be a matter of style, but I have a bit of feedback about how this paper is written. Firstly, the introductory material (all paragraphs and figures before "Constructing a phylogeny, however, is non-trivial...") are too basic for my taste -- I don't think evolution itself needs to be described in fundamental detail at all. This explanation is not appropriate for the package's audience -- we know all this already. Rather, I suggest starting with the logic of paragraph 4 ("Constructing a phylogeny, however, is non-trivial...") which states the problem at hand nicely. Although I like the core idea of this paragraph, the author may also consider fleshing out an explanation of why statistical models are used at all.

What I will also more strongly recommend is that the author give the reader a sense of where to start when using the package. Broadly speaking, what is the analytical pipeline one should use with mcbette? I think inclusion of a worked example in the paper would help users know where to start, how to structure their data, which functions to look into, and which vignettes they may want to look into. This would be great to have as either the penultimate or the final paragraph. I don't think it needs to go into specific detail, but just give the reader a general sense of how the package works.

  • [x] Authors: A list of authors with their affiliations

  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.

  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

As noted by the other reviewer, some DOIs are missing and it would be great to have them included

Functionality

Final approval (post-review)

Estimated hours spent reviewing: ~3.5


Review Comments

Firstly, I am quite happy to see that this package exists and was submitted to rOpenSci. mcbette is poised to be very helpful towards biologists who work on tree inference and phylogenetic comparative analyses. By using BEAST and accompanying software, mcbette builds off the author's previous babette and helps expand a user's ability to rely on reproducible pipelines for analyzing genetic data. As someone who will likely use babette and other packages in this ecosystem in the near future, I appreciate what mcbette brings to the table: giving a reproducible approach to model comparisons of sequence evolution.

In running the worked examples as well as feeding in sequences from my own data sets, I did not encounter any issues with how the code ran or the results it produced. To the best of my ability to assess this, mcbette seems to produce correct results.

That being said, testing failed on my machine. Please see above for the relevant lines from the devtools check output. Also in investigating testthat.R, the result of line 6 test_check("mcbette") was:

Error: No tests found for mcbette

Unlike the other reviewer, I did not encounter issues during install (using macOS 10.15.4 with R 3.6.2 and then again on R 4.0.0) so I am sorry to not be able to offer any insight on troubleshooting. I should note that I already had rJava installed, which may help explain the lack of problems.

My inline comments above are geared towards enhancing the clarity of the documentation. I will agree with the other reviewer's assessment -- the documentation and the accompanying paper are a little too sparse as presently written. I'd appreciate seeing expansions of both the high-level functionality (in a broad sense) as well as more detailed descriptions of how the author envisions users feeding data into analytical pipelines. I don't mind having similar information repeated in the readme, description, vignettes, and paper; doing so helps not only reinforce the message of the package and also helps users' understanding regardless of their point of entry into the package.

I agree with all the specific bullet points the other reviewer raised regarding the Documentation and Functionality sections and for the sake of brevity won't repeat them here. I'll add that providing a full(er) list of potential models would be great to have, perhaps as its own vignette.

I do not necessarily agree with the other reviewer's comments regarding the JOSS paper but this is primarily because I feel the paper should be restructured (please see inline comments above).

Thanks again for the opportunity to review this package. Again, my suggestions are motivated by wanting to enhance the clarity of presentation and make it easier for potential users to see the value of relying on mcbette. It's nice work!

🐢

richelbilderbeek commented 4 years ago

Dear @bjoelle and @vbaliga, thanks so much for the detailed reviews and the time you put into this!

@maelle: I assume I can get now started :rocket:. I'll regularily post updates here :+1:

richelbilderbeek commented 4 years ago

Processing feedback here.

maelle commented 4 years ago

Thanks @richelbilderbeek, yes!

Thanks @vbaliga & @bjoelle for your reviews! :smile_cat:

richelbilderbeek commented 4 years ago

Progress has been slow, a.o. due to illness. Will work on mcbette Friday July 10th.

maelle commented 4 years ago

@richelbilderbeek I'm sorry to read that, I hope you're feeling better.

richelbilderbeek commented 4 years ago

Going through the feedback, goes fine, but will not finish today.

richelbilderbeek commented 4 years ago

Nearly done processing all feedback. As expected: the package already got so much better :+1:

maelle commented 4 years ago

Thanks for the update! 😀

maelle commented 4 years ago

:wave: @richelbilderbeek, any update?

richelbilderbeek commented 4 years ago

Will put in some more work on Friday/Saturday/Sunday :rainbow:

richelbilderbeek commented 4 years ago

Made some progress today :+1:

richelbilderbeek commented 4 years ago

:partying_face:

I have processed all the feedback, see my feedback here: https://github.com/richelbilderbeek/mcbette/issues/30.

There is a (reported) problem with the Travis CI service, hence the build does not pass there. It does pass locally.

maelle commented 4 years ago

@richelbilderbeek thank you! could you please post your response in this thread?

richelbilderbeek commented 4 years ago

The first reviewer's comments:

Installation

I had trouble with the package installation due to issues with rJava, and never got it working on my Mac (macOS 10.14.6). The CRAN version of rJava is looking for a specific JDK which isn't the one on my machine. Compiling it from source appeared to work, but running any function crashes the R console. Installing on a Linux cluster worked.

It looks to me as if the dependency on rJava is only used in the get_default_java_path function, which seems disproportionate with the amount of trouble it is. I would recommend exploring whether the dependency can be removed, and if not providing a work-around for people who cannot get rJava working - one option would be to ask users to provide the path to Java manually.

I completely agree that getting rJava to work is the toughest nut to crack! I've seen this is true for Linux, Mac and Windows. Yet, beastier (a dependency of mcbette) needs it to -indeed- get the default Java path as well getting the Java version, which helps out in processing bug reports (created by beastier::beastier_report).

However, I feel hand-coding the same subset of functionality is not the way to go. rJava is the package to work with Java and I hope (and predict) the rJava maintainers to simplify its installation. I feel the responsibility about get Java info is at rJava, instead of mcbette: mcbette is there for model comparison.

Note that the rJava maintainer strongly recommends upgrading to rJava 0.9-12 at the 2020/03/23 rJava release update: maybe that update fixed exactly the problem Joelle had.

Other comments

instructions on how to configure to use an existing BEAST2 installation would be helpful, if it is possible.

I have documented using BEAST2 installed at a custom location with this commit.

On Mac mcbette is not looking for it in the default place it is installed to, so that could lead to many duplicate installations.

This would indeed be weird! I looked into this, by writing a test that installs BEAST2 at a custom location, after which mcbette is called to use that BEAST2 install (see this commit). The test, however, passes. If this weird behavior could be reproduced, that would definitely be hulpful, but up until then, I cannot fix a problem I cannot reproduce.

Documentation

the documentation (paper, README, vignette) was generally unclear as to which models can be selected using NS - the only mention I found was in the DESCRIPTION, saying that some substitution and clock models are available. It would be useful to include this in the README along with a list of available models. One example showing clock model comparison, in the vignette for instance, would also be useful.

I agree info on this was thin. I've created a vignette in beautier that showcases the inference model options. In README, vignette and paper, I added the available site, clock and tree models.

in general, the documentation assumes that people using mcbette will already be familiar with the other packages of the suite, particularly beautier. I think this expectation needs to be stated much more clearly in the README and vignette, and that a link to an appropriate introduction vignette should be provided.

I've described the relation to the beautier package in more places and link to a vignette in beautier that showcases the inference model options.

the example in the README should describe the tested model briefly (JC69, strict clock, etc).

I added this.

there is a create_ns_inference_model function in the package, but the examples (README, vignette, package doc) use the beautier::create_test_ns_inference_model function instead - it's unclear what the difference is between the two, or which one should be used for a real analysis. If the test function is only used to set up a shorter analysis (as the README seems to imply ?), then it would be better to set the shorter analysis with the real function and modified parameters, and explain what is being done.

I moved create_ns_inference_model to the beautier package and made the documentation of the create_[something]_inference_model functions link to each other.

Additionally, I followed your suggestion and replaced:

x <- create_ns_test_inference_model()

by

x <- create_ns_inference_model()
x$mcmc <- create_test_ns_mcmc()

in the default parameter documentation, beast2_options should link to the create_mcbette_beast2_options function used in all the examples, instead of the beastier::create_beast2_options function. Similarly inference_model should link to either create_ns_inference_model or beautier::create_test_ns_inference_model rather than beautier::create_inference_model.

I made these link to one another. Note that the inference_model functions have been moves to beautier, and the beast2_options functions to beastier

when autocompleting in RStudio, the description shown on the function is the beginning of the Description section. This means that for instance check_mcbette_state shows only Will stop otherwise.. Same issue with check_beast2_ns_pkg, interpret_bayes_factor.

This is a neat observation, that I was unaware of. I have fixed this.

contrary to what the doc says, est_marg_lik does not return either estimates or trees.

Well spotted! I've updated the doc in this commit.

the error message when est_marg_lik fails prints the inference model nicely, but not the BEAST2 options (it shows beast2_1c6e311fc10d3.xml.stateNANAFALSETRUE).

Hmmm, I wish I could reproduce that! Because if I look into the code, I do not treat the inference model different than the BEAST2 options:

  tryCatch({
      bbt_run_out <- babette::bbt_run_from_model(
        fasta_filename = fasta_filename,
        inference_model = inference_model,
        beast2_options = beast2_options
      )
      ns <- bbt_run_out$ns
    },
    error = function(e) {
      stop(
        "Could not estimate the marginal likelihood. \n",
        "Error message: ", e$message, "\n",
        "fasta_filename: ", fasta_filename, "\n",
        "beast2_options: ", beast2_options, "\n",
        "inference_model: ", inference_model, "\n"
      )
    }

I would enjoy fixing this, but if I cannot check my results, I cannot see if I have fixed it. Could you reproduce the bug? If yes, I will happily add it!

it would be useful to have an example in the interpret_bayes_factor doc of how to use that function with the output from est_marg_liks.

I agreed, up until I added interpret_marg_lik_estimates, which interprets the results of est_marg_liks (https://github.com/richelbilderbeek/mcbette/issues/34), which -IMHO- makes the interpret_bayes_factor redundant. I hope you agree!

Minor comments

the doc for est_marg_lik links to itself, same with est_marg_liks

Well spotted! It now links correctly to est_marg_liks

* `est_marg_lik.R`, l16 "Actual" -> "Current"

Renamed.

Functionality

the main feature I think is missing from the package at the moment is the ability to keep the log and tree output of the babette runs, in addition to the marginal likelihood. The documentation says that the trees are returned, but that does not appear to be the case. I would expect that users doing model selection will also be interested in the parameter estimates for the best model.

To the mcbette main documentation (?mcbette) and mcbette::est_marg_liks I've added how to obtain the parameter estimates, posterior trees and other things stored in temporary files.

my other suggestion is to have a function similar to interpret_bayes_factor which could directly process the output from est_marg_liks, and print the table with the best model highlighted and how all the others compare to that one.

Great idea, I did so by adding interpret_bayes_factor_log_lik (https://github.com/richelbilderbeek/mcbette/issues/33). It even comes with a text plot!

JOSS paper comments

* two references are missing DOIs (required according to JOSS guidelines)

I added DOIs to all journal articles.

* there is a `> Example text below` left in l18

Blimey, well spotted! I've put the Summary there :+1:

* the single quotes in the title (`'babette'`) give me a yaml parser error when building the file

I removed these quotes altogether. I've added a Travis CI test to verify that knitr::knit does work on the file.

* l59-61 I think you need to clarify that you are only talking about living species here rather than all species. It's also unclear to me why you say we "should be able to reconstruct" rather than "can reconstruct".

Agreed, I've used to wording you suggested.

* l61-62 I would introduce phylogenies earlier, when showing the primates tree.

I rewrote most of the paper, after which phylogenies are introduced sooner.

* l70-71 I think this sentence is too vague. I would suggest replacing it with "However, constructing a phylogeny is non-trivial, as it requires the researcher to choose how to model the evolutionary process, including..."

Agreed, I've used your more precise wording. Thanks!

* l80-85 This section needs to mention that the 'best' model is always defined in regards to a particular dataset. It's explained later but needs to be made clear earlier.

I have reworded in a such a way to avoid using the word 'best', and it includes that it is dependent on the alignment/dataset:

mcbette is an R package that determines the model with the most evidence (aka marginal likelihood) for having generated the alignment, from a set of models.

* l85 I don't think you can reuse this definition of 'best' for phylogenies - what would it mean for a phylogeny to be "simple" ?

Agreed, I removed it. Thanks!

Minor comments

* l49 then -> than

Fixed. Thanks!

* l63 a the -> the

Fixed. Thanks!

Did so.

richelbilderbeek commented 4 years ago

Second reviewer's reply:

Documentation

The package includes all the following forms of documentation:

[ ] A statement of need clearly stating problems the software is designed to solve and its target audience in README A statement is present, but I do not think it states the problem clearly enough. What is a "Model Comparison"? The phrase is too generic to give a user a sense of what mcbette intends to solve. Although someone familiar with BEAST can likely guess this for him/herself, it should not be up to the reader to interpret. I don't think the author needs to go into a whole lot of detail, but I'd suggest adding 2-3 more sentences to elaborate on what the point of fitting the models is and why the models are being compared at the top of the readme where the package is introduced. There's a few sentences in the paper.md that achieve this nicely, so I suggest just incorporating them here as well. I'd also rephrase "solve a common problem" in the Example section to give an explanation of what the problem in the example is and why it is being solved. Again, I anticipate that a user of this package will likely know how all this works. Rather, the point here is to make it more immediately clear what mcbette does when a user first encounters the package.

Indeed, it was a bit concise. I have rephrased this as such:

mcbette allows for doing a Model Comparison using babette (hence the name), that is, given a (say, DNA) alignment, it compares multiple phylogenetic inference models to find the model that is likeliest to generate that alignment. With this, one can find the phylogenetic inference model that is simple enough, but not too simple.

To do so, mcbette uses babette [Bilderbeek and Etienne, 2018] with the addition of using the BEAST2 [Bouckaert et al., 2019] nested sampling package as described in [Russell et al., 2019].

[x] Installation instructions: for the development version of package and any non-standard dependencies in README In contrast to the other reviewer, I did not encounter any issues with installation or running code (macOS 10.15.4).

Great!

[x] Vignette(s) demonstrating major functionality that runs successfully locally Great work! Nice explanation of a clear example. I'll suggest linking this on the readme too as it is very helpful for understanding what the package does and how it works.

Great!

[x] Function Documentation: for all exported functions in R help

Happy to read I did a good job :+1:

[x] Examples for all exported functions in R Help that run successfully locally

Happy to read I did a good job :+1:

[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING, and DESCRIPTION with URL, BugReports and Maintainer (which may be autogenerated via Authors@R).

Happy to read I did a good job :+1:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software

Indeed, somehow this short summary is missing! I've added it :+1:

Some of this may be a matter of style, but I have a bit of feedback about how this paper is written. Firstly, the introductory material (all paragraphs and figures before "Constructing a phylogeny, however, is non-trivial...") are too basic for my taste -- I don't think evolution itself needs to be described in fundamental detail at all. This explanation is not appropriate for the package's audience -- we know all this already. Rather, I suggest starting with the logic of paragraph 4 ("Constructing a phylogeny, however, is non-trivial...") which states the problem at hand nicely. Although I like the core idea of this paragraph, the author may also consider fleshing out an explanation of why statistical models are used at all.

Thanks to your feedback, I have removed the whole -indeed slow- introduction and started at the point you suggested. With some rewriting, I feel I have introduced the need for statistical models enough (by introducing the term 'evidence', aka marginal likelihood) and I hope you agree.

What I will also more strongly recommend is that the author give the reader a sense of where to start when using the package. Broadly speaking, what is the analytical pipeline one should use with mcbette? I think inclusion of a worked example in the paper would help users know where to start, how to structure their data, which functions to look into, and which vignettes they may want to look into. This would be great to have as either the penultimate or the final paragraph. I don't think it needs to go into specific detail, but just give the reader a general sense of how the package works.

Agreed! I have added a section 'Getting Started' that should introduce the reader to the field and direct him/her to the literature.

  • [x] Authors: A list of authors with their affiliations
  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software). As noted by the other reviewer, some DOIs are missing and it would be great to have them included

Agreed, thanks! I have added these!

Functionality

[x] Installation: Installation succeeds as documented. Nice touch with the can_run_mcbette() function! Very handy!

Thanks! If you like that one, you'll love mcbette_report :sunglasses: :+1:

  • [x] Functionality: Any functional claims of the software been confirmed.

Cool!

  • [x] Performance: Any performance claims of the software been confirmed.

Awesome.

  • [ ] Automated tests: Unit tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.

I am unsure why, but testing failed on my machine. Here are the relevant lines from the devtools check:

── Test failures ───────────────────────────────────────────────── testthat ────

library(testthat) library(mcbette) before <- get_mcbette_state() test_check("mcbette") ── 1. Error: Package style (@test-style.R#2) ─────────────────────────── invalid 'path' argument Backtrace:

  1. lintr::expect_lint_free()
  2. lintr::lint_package(...)
  3. base::normalizePath(path, mustWork = FALSE)
  4. base::path.expand(path)

══ testthat results ════════════════════════════════════════════════════ [ OK: 88 | SKIPPED: 5 | WARNINGS: 0 | FAILED: 1 ]

  1. Error: Package style (@test-style.R#2)

Error: testthat unit tests failed Execution halted 1 error x | 0 warnings ✓ | 1 note x Error: R CMD check found ERRORs Execution halted Exited with status 1.

I have the same error. This was not the case upon the submission of mcbette. A bug report has already been filed here in October 2019. I've removed the tests, as it is tested by Travis CI anyways.

Final approval (post-review)

  • [ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Let's see what this reviewer thinks about this :+1:

Estimated hours spent reviewing: ~3.5

* [x]  Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.

Done, including the ORCID ID :+1:

Review Comments

Firstly, I am quite happy to see that this package exists and was submitted to rOpenSci. mcbette is poised to be very helpful towards biologists who work on tree inference and phylogenetic comparative analyses. By using BEAST and accompanying software, mcbette builds off the author's previous babette and helps expand a user's ability to rely on reproducible pipelines for analyzing genetic data. As someone who will likely use babette and other packages in this ecosystem in the near future, I appreciate what mcbette brings to the table: giving a reproducible approach to model comparisons of sequence evolution.

In running the worked examples as well as feeding in sequences from my own data sets, I did not encounter any issues with how the code ran or the results it produced. To the best of my ability to assess this, mcbette seems to produce correct results.

Happy to hear this!

That being said, testing failed on my machine. Please see above for the relevant lines from the devtools check output. Also in investigating testthat.R, the result of line 6 test_check("mcbette") was:

Error: No tests found for mcbette

As noted above, the devtools issues are caused by something upstream (in the lintr package) and needs to be resolved there.

Regarding the second error, I can perfectly reproduce this. I predict this is a feature, not a bug. With CTRL+SHIFT+T in R Studio or devtools::test() the tests do start as expected. Also, mcbette is tested by Travis CI (for Linux and Mac) and AppVeyor (for Windows) and its code coverage is measured from these tests. Therefore, I assume the tests works and that the No tests found for mcbette message is some R feature I and you misinterpret.

Unlike the other reviewer, I did not encounter issues during install (using macOS 10.15.4 with R 3.6.2 and then again on R 4.0.0) so I am sorry to not be able to offer any insight on troubleshooting. I should note that I already had rJava installed, which may help explain the lack of problems.

I am happy to have this confirmed :+1:

My inline comments above are geared towards enhancing the clarity of the documentation. I will agree with the other reviewer's assessment -- the documentation and the accompanying paper are a little too sparse as presently written. I'd appreciate seeing expansions of both the high-level functionality (in a broad sense) as well as more detailed descriptions of how the author envisions users feeding data into analytical pipelines. I don't mind having similar information repeated in the readme, description, vignettes, and paper; doing so helps not only reinforce the message of the package and also helps users' understanding regardless of their point of entry into the package.

I added quite some documentation, due to feedback from the other reviewer indeed. I hope this will also satisfy your needs.

I agree with all the specific bullet points the other reviewer raised regarding the Documentation and Functionality sections and for the sake of brevity won't repeat them here. I'll add that providing a full(er) list of potential models would be great to have, perhaps as its own vignette.

I do not necessarily agree with the other reviewer's comments regarding the JOSS paper but this is primarily because I feel the paper should be restructured (please see inline comments above).

I have restructured the paper from a mix of both reviews. I hope you both like it. I, for sure, do!

Thanks again for the opportunity to review this package. Again, my suggestions are motivated by wanting to enhance the clarity of presentation and make it easier for potential users to see the value of relying on mcbette. It's nice work!

turtle

Thanks so much for reviewing!

richelbilderbeek commented 4 years ago

@richelbilderbeek thank you! could you please post your response in this thread?

@maelle: Here goes! A bit clumsy that I forgot :blush: ....

maelle commented 4 years ago

Thank you, no problem!

@vbaliga @bjoelle are you satisfied with @richelbilderbeek's response? If so would you mind checking the box " The author has responded to my review and made changes to my satisfaction. I recommend approving this package." in your review and if not would you be able to add some comments in this thread? In any case many thanks already and again for your reviews.

vbaliga commented 4 years ago

Rather than edit my previous review, I am electing to write a fresh one here:

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [x] A short summary describing the high-level functionality of the software

  • [x] Authors: A list of authors with their affiliations

  • [x] A statement of need clearly stating problems the software is designed to solve and its target audience.

Very nicely done!

  • [x] References: with DOIs for all those that have one (e.g. papers, datasets, software).

As noted by the other reviewer, some DOIs are missing and it would be great to have them included

Functionality

Final approval (post-review)

Estimated hours spent reviewing: <1 for this round


Review Comments

Thanks very much for taking my feedback into account. This package was already very good and I think it is even better now in the sense that it is more approachable for new users.

I have a couple minor points to bring up, but please note that despite these points I am satisfied with the changes I already see and I am supportive of this work's publication.

During devtools::check(), I encountered:

> checking R code for possible problems ... NOTE
  interpret_marg_lik_estimates: no visible global function definition for
    ‘na.omit’
  Undefined global functions or variables:
    na.omit
  Consider adding
    importFrom("stats", "na.omit")
  to your NAMESPACE file.

The other point is that I think it would be a nice touch to link to the vignettes at the end of paper.md. The paper does a nice job giving an overview of the package and it would be great to end it by giving users a suggestion of what they might want to read next, i.e. the vignettes.

Thanks again to @maelle for the opportunity to review this and to @richelbilderbeek for taking my feedback. I am looking forward to using mcbette and Richel's other packages in the near future.

Best regards, Vikram

🐢

richelbilderbeek commented 4 years ago

@vbaliga is right: I will process his feedback in this weekend I already processed his feedback :+1:

bjoelle commented 4 years ago

Sorry it took me so long to come back to this review ! Overall I am happy with the changes that have been made to the package and documentation. My one comment would be that I feel like the current version of the JOSS paper underplays the usefulness of the package (the statement of need in particular feels a bit weak). I would state more clearly what the added value of mcbette is (compared to using BEAST2 directly) : easy set-up and execution of several parallel analyses using different models, easy analysis and nice printing of the results of the model selection, possibility to integrate it into an existing R pipeline, etc.

maelle commented 4 years ago

Thank you @bjoelle! The JOSS comments are very important. @richelbilderbeek note that JOSS recently changed their scope so please read the scope again before submitting https://joss.theoj.org/about#submitting

Approved! Thanks @richelbilderbeek for submitting and @vbaliga @bjoelle for your reviews! :smile_cat: To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this.

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

richelbilderbeek commented 4 years ago

Great! Will do so ASAP, probably next weekend :+1:

maelle commented 4 years ago

Ok, thanks for the update! :pray:

richelbilderbeek commented 4 years ago

Thank you @bjoelle! The JOSS comments are very important. @richelbilderbeek note that JOSS recently changed their scope so please read the scope again before submitting https://joss.theoj.org/about#submitting

I re-read the scope. AFAICS, the article is still within the scope.

To-dos:

* [x]  Transfer the repo to rOpenSci's "ropensci" GitHub organization under "Settings" in your repo.  I have invited you to a team that should allow you to do so.  You'll be made admin once you do.

Done :+1:

* [x]  Fix all links to the GitHub repo to point to the repo under the ropensci organization.

I did a 'replace all' using a regex searching for all files with richelbilderbeek/mcbette and replaced it with ropensci/mcbette

* [x]  Delete your current code of conduct file if you had one since rOpenSci's default one will apply, see https://devguide.ropensci.org/collaboration.html#coc-file

Done.

* [x]  If you already had a `pkgdown` website **and are ok relying only on [rOpenSci central docs building and branding](https://devguide.ropensci.org/ci.html#even-more-ci-ropensci-docs)**, [...]

Not applicable

* [x]  Fix any links in badges for CI and coverage to point to the ropensci URL. We no longer transfer Appveyor projects to ropensci Appveyor account so after transfer of your repo to rOpenSci's "ropensci" GitHub organization the badge should be `[![AppVeyor Build Status](https://ci.appveyor.com/api/projects/status/github/ropensci/pkgname?branch=master&svg=true)](https://ci.appveyor.com/project/individualaccount/pkgname)`. If Appveyor does not pick up new commits after transfer, you might need to delete and re-create the Appveyor project. (Repo transfers are smoother with Travis CI and GitHub Actions)

Done :+1:

* [x]  We're starting to roll out software metadata files to all ropensci packages via the Codemeta initiative, see https://github.com/ropensci/codemetar/#codemetar for how to include it in your package, after installing the package - should be easy as running codemetar::write_codemeta() in the root of your package.

Done :+1:

* [ ]  Activate Zenodo watching the repo

Unsure if I did so correctly. I did create a DOI for the current version, 10.5281/zenodo.4076183

* [x]  Tag and create a release so as to create a Zenodo version and DOI

Done, DOI is 10.5281/zenodo.4076183

* [x]  Submit to JOSS at https://joss.theoj.org/papers/new, using the rOpenSci GitHub repo URL. When a JOSS "PRE REVIEW" issue is generated for your paper, add the comment: `This package has been reviewed by rOpenSci: https://LINK.TO/THE/REVIEW/ISSUE, @ropensci/editors`

Done so, at https://github.com/openjournals/joss-reviews/issues/2738

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this.

Done :+1:

Welcome aboard! We'd love to host a post about your package - either a short introduction to it with an example for a technical audience or a longer post with some narrative about its development or something you learned, and an example of its use for a broader readership. If you are interested, consult the blog guide, and tag @stefaniebutland in your reply. She will get in touch about timing and can answer any questions.

Hi @stefaniebutland, I would enjoy to write a blog post about mcbette. I would enjoy it to work together again :+1:

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

I did not need the book this time...

stefaniebutland commented 4 years ago

Congratulations @richelbilderbeek on another package passing peer review! Do you think this one would be suitable as a tech note, rather than longer narrative blog post?

Our Blog Guide has details to help you decide, and provides technical details for submitting your post. I think we published this after your babette post.

Please suggest a date by which you would like to submit a draft via pull request and I will reply with a publication date. My colleague Steffi LaZerte will review.

maelle commented 4 years ago

Thanks @richelbilderbeek !