openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
703 stars 36 forks source link

[REVIEW]: Mashtree: a rapid comparison of whole genome sequence files #1762

Closed whedon closed 4 years ago

whedon commented 4 years ago

Submitting author: @lskatz (Lee Katz) Repository: https://github.com/lskatz/mashtree Version: v1.0 Editor: @csoneson Reviewers: @luizirber, @standage Archive: 10.5281/zenodo.3568386

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/ad5ba5ffe250613f4266cf4d5c4c867b"><img src="https://joss.theoj.org/papers/ad5ba5ffe250613f4266cf4d5c4c867b/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/ad5ba5ffe250613f4266cf4d5c4c867b/status.svg)](https://joss.theoj.org/papers/ad5ba5ffe250613f4266cf4d5c4c867b)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@luizirber & @peterwc, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.

Please try and complete your review in the next two weeks

Review checklist for @luizirber

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @standage

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 4 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @luizirber, @peterwc it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
csoneson commented 4 years ago

@luizirber, @peterwc - this is where the review happens. Instructions are available above, don't hesitate to ping me if you have questions!

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

csoneson commented 4 years ago

@lskatz - please have a look at the checklists above and make sure that your submission has all required components. Otherwise, please add them already now to simplify the review.

A couple of comments about your paper that would be good if you could address:

lskatz commented 4 years ago

Thank you @luizirber and @peterwc for reviewing!

Thank you @csoneson! I edited the figure legends so that they have titles. I also took the text that I thought was in the figure legend and placed them into the actual legend. I will ask the bot to regenerate the pdf to make sure the formatting works. I can also remove some text from the introduction in a future edit.

lskatz commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

PDF failed to compile for issue #1762 with the following error:

/app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/lib/whedon.rb:135:in check_fields': Paper YAML header is missing expected fields: tags, authors, affiliations (RuntimeError) from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/lib/whedon.rb:87:ininitialize' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/lib/whedon/processor.rb:36:in new' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/lib/whedon/processor.rb:36:inset_paper' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/bin/whedon:55:in prepare' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/command.rb:27:inrun' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in invoke_command' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor.rb:387:indispatch' from /app/vendor/bundle/ruby/2.4.0/gems/thor-0.20.3/lib/thor/base.rb:466:in start' from /app/vendor/bundle/ruby/2.4.0/bundler/gems/whedon-efe915e61673/bin/whedon:116:in<top (required)>' from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in load' from /app/vendor/bundle/ruby/2.4.0/bin/whedon:23:in

'

lskatz commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

PDF failed to compile for issue #1762 with the following error:

Error producing PDF. ! Argument of \caption@ydblarg has an extra }.

\par l.543 } Looks like we failed to compile the PDF
lskatz commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

PDF failed to compile for issue #1762 with the following error:

Error producing PDF. ! Argument of \caption@ydblarg has an extra }.

\par l.538 for the random seed in the Mash program.} Looks like we failed to compile the PDF
lskatz commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

lskatz commented 4 years ago

I will leave it there for now and will attempt to shorten the introduction after receiving reviewer feedback.

csoneson commented 4 years ago

Hi @luizirber, @peterwc - just checking in on how the reviews are progressing. Do you think you can have comments for @lskatz within the next week or so? Don't hesitate to let me know if you have questions. Thanks!

peterwc commented 4 years ago

Working on it this week. I apologize for the delay. Expecting a Friday completion.

Thanks, Peter

On Thu, Oct 10, 2019 at 10:58 AM Charlotte Soneson notifications@github.com wrote:

Hi @luizirber https://github.com/luizirber, @peterwc https://github.com/peterwc - just checking in on how the reviews are progressing. Do you think you can have comments for @lskatz https://github.com/lskatz within the next week or so? Don't hesitate to let me know if you have questions. Thanks!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/openjournals/joss-reviews/issues/1762?email_source=notifications&email_token=ABLR363IK4TNVPW2VGPZAVDQN47J7A5CNFSM4I2IFUO2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEA4VFZY#issuecomment-540627687, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABLR36ZAHJWW2P6MSIT762LQN47J7ANCNFSM4I2IFUOQ .

-- Peter Cook, PhD Centers for Disease Control and Prevention Influenza Division, Zoonotic Virus Team Email: pcook@cdc.gov Email: peterwcook7@gmail.com Phone: (765)490-1228

luizirber commented 4 years ago

Overall this is all very good, thanks @lskatz! Here's some review points:

Functionality

Installation

The install instructions and requirements could have a bit more information, I had to dig a bit to find proper Perl package names (for use with cpanm) and links to quicktree and mash.

Also adding that mashtree is available on bioconda!

This is what I ended up doing to install using an Ubuntu docker image and the "Installation from Git" section:

git clone https://github.com/lskatz/mashtree.git
cd mashtree
docker run -it -v $PWD:/mashtree ubuntu:latest /bin/bash

Inside docker (travis config was more useful here): https://github.com/lskatz/mashtree/blob/e9697e7496b5c6a85ae69832189ea8350cacfdfb/.travis.yml

apt update
apt install -y build-essential cpanminus libexpat1-dev wget sqlite3 libsqlite3-dev

cd /mashtree

cpanm -n BioPerl Bio::Sketch::Mash DBD::SQLite DBI Graph::Dijkstra

wget https://github.com/marbl/Mash/releases/download/v2.2/mash-Linux64-v2.2.tar
tar xf mash-Linux64-v2.2.tar
mv mash-Linux64-v2.2/mash /usr/bin/mash

wget https://github.com/khowe/quicktree/archive/v2.5.zip
tar xf v2.5.tar.gz 
cd quicktree-2.5
make
mv quicktree /usr/bin
cd ..

perl Makefile.PL
make test

Documentation

Installation instructions

(see above)

Functionality documentation

This is missing for using it as a library in Perl, but since the primary form of interaction is through the CLI I think it is OK.

Community guidelines

I didn't see it in the README, but you can also create a CONTRIBUTING file (GitHub recognizes it and render the content when a new issue or PR is opened): https://github.blog/2012-09-17-contributing-guidelines/

Software paper

References

Most of them are missing DOIs, can you please add them?

lskatz commented 4 years ago

Thank you @luizirber ! I am working on addressing these helpful comments. For the references comment, are you asking for me to add DOIs to the references in the paper itself or in the git repo (I can do either or both, just making sure).

luizirber commented 4 years ago

@lskatz put the DOIs in the paper.bib file in your repo, when the paper is regenerated by whedon they will be picked up.

lskatz commented 4 years ago

Thank you for your comments @luizirber ! I did the following:

  1. Added a committed installation document at https://github.com/lskatz/mashtree/blob/master/docs/INSTALL.md
    • This document is linked from the README, https://github.com/lskatz/mashtree/blob/master/README.md#installation
    • All instructions you suggested have been incorporated. They were very helpful, thank you! However, I switched some details so that it could be installed outside of Docker. Docker at my institution and at many other places is a security risk and so I feel like installation in a home directory is more universal.
  2. Functionality documentation: yes, I believe that since it is offered as a package in Perl that the most appropriate mode of distribution was in the native package manager, CPAN. It is not meant strictly as a library but as a CLI executable. So I think your train of thought is correct.
  3. Thank you for the community guidelines suggestion. I added guidelines per your suggestion.
  4. I added DOIs and will double check whether @whedon generates the PDF without error momentarily.
lskatz commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

peterwc commented 4 years ago

@csoneson I will continue with my review but I realized that I may be in a gray area for this review. I apologize for not bringing it up sooner. I am a bioinformatics researcher at the CDC, but I work in a different division than @lskatz. I have not been on any papers with him, but I do know him personally. After reading through the ethics guidelines I decided it would be best to ask for waiver before submitting my review. I am impartial and will provide a complete and thorough review if I am granted a waiver.

lskatz commented 4 years ago

I will ping the other potential reviewers, @csoneson. I don't anticipate anyone else will have @peterwc's concerns. What is the process for adding them as reviewers?

I also want to point out that yes @peterwc works at CDC, and I work at CDC, but we are in totally different centers and do not work with each other. His center has about 1500 people in it currently, and mine has about 2800 people in it.

csoneson commented 4 years ago

@peterwc - thanks a lot for your disclosure. After consideration, and in light of your personal connection with @lskatz, I would actually prefer to try to recruit a different reviewer for this submission; but I would like to thank you for engaging with JOSS and I hope we can get back to you for another review in the future!

csoneson commented 4 years ago

@ondovb - would you be interested in reviewing this submission for the Journal of Open Source Software?

"Mashtree: a rapid comparison of whole genome sequence files" GitHub repository: https://github.com/lskatz/mashtree

peterwc commented 4 years ago

@csoneson Thank you and please keep me in mind for other opportunities to review. I am still available for this one if things change.

lskatz commented 4 years ago

Hi @csoneson if no one responds, can I suggest these other reviewers? I picked them from the JOSS list of potential reviewers.

standage commented 4 years ago

I'd be happy to conduct a review if needed. 👍

greatfireball commented 4 years ago

Me too, just send me a note

csoneson commented 4 years ago

@whedon remove @peterwc as reviewer

whedon commented 4 years ago

OK, @peterwc is no longer a reviewer

csoneson commented 4 years ago

@standage, @greatfireball - that's great, thank you for agreeing to review, it would be great to get your input! I'll add you both as reviewers here, and also add a checklist for each of you in the first post above.

csoneson commented 4 years ago

@whedon add @standage as reviewer

whedon commented 4 years ago

OK, @standage is now a reviewer

csoneson commented 4 years ago

@whedon add @greatfireball as reviewer

whedon commented 4 years ago

OK, @greatfireball is now a reviewer

luizirber commented 4 years ago
* All instructions you suggested have been incorporated.  They were very helpful, thank you!  However, I switched some details so that it could be installed outside of Docker.  Docker at my institution and at many other places is a security risk and so I feel like installation in a home directory is more universal.

Sounds great! I did in docker to avoid having to install a lot of extra packages in my system, but doing a home directory installation is the best path (since many people won't have admin rights to their system).

1. Thank you for the community guidelines suggestion.  I added guidelines per your suggestion.
1. I added DOIs and will double check whether @whedon generates the PDF without error momentarily.

This all looks good. Thanks @lskatz!

@csoneson, checklist complete for me.

lskatz commented 4 years ago

Thank you @luizirber! And thank you to my new reviewers! I am looking forward to your feedback!

standage commented 4 years ago
---
title: 'Review of Mashtree: a rapid comparison of whole genome sequence files'
author: Daniel Standage
date: 2019-10-31
---

The authors describe Mashtree, a utility for constructing trees from sequence reads or assemblies using Min-hash sketch distances. Mashtree leverages Mash to compute distances rapidly and efficiently, and QuickTree to build a neighbor-joining tree from the resulting distance matrix. Mashtree supports two modes of operation: a "fast" mode based on a fixed minimum k-mer abundance filter, and an "accurate" mode where the filter is inferred empirically from the k-mer abundance distribution. Mashtree also implements two resampling methods for estimating the confidence of the trees it constructs.

Mashtree provides several features bioinformatics and genomics practitioners will find convenient, including the following.

I have added several comments below for the authors' consideration, but in any case I endorse this submission for publication in JOSS.

Installation

TL;DR Conda and from-source installation modes seem to work well. 👍

Mashtree's documentation describes multiple modes of installation/execution, including Bioconda, Docker, CPAN, and from the source code distribution. Installing from Bioconda on my OS X system was painless, although I was skeptical whether this reflected the latest version. Accordingly, I also installed from source (latest master on Githuub) on a fresh Ubuntu VM. This process was also fairly straightforward, although I do have a couple comments.

Best Practices

TL;DR CI GOOD 👍, MOAR PR!

The Mashtree repository includes an automated test suite, which is executed frequently in a Contintuous Integration (CI) environment. Potential users of Mashtree can be confident that bugs will be identified early and often, and hopefully most will be addressed before stable releases are published.

Pull requests and CI have been used to review 3rd-party (and possibly internal/collaborator) contributions. I'd suggest that the corresponding author consider using pull requests more routinely—i.e., for all non-trivial updates to the repository. Even for projects where I am the sole active contributor, I have found value in providing a summary and description of each update, linking to relevant issue threads as necessary, and waiting to merge the pull request until the CI build passes.

standage commented 4 years ago

@csoneson, this completes my review of Mashtree.

@lskatz, if you have any comments or questions, feel free to keep the conversation going here or on the Mashtree issue tracker—just @ me! 😎

csoneson commented 4 years ago

@csoneson, this completes my review of Mashtree.

@standage - thanks for this!

lskatz commented 4 years ago

Thank you @standage for that wonderful review! To address one concern, I think that the issue might have been that I did not have a complete cpanm command, when I should have had -l ~ in there. In other words, it should fix where the libraries should install to, $HOME/lib/perl5. So now one of the lines in INSTALL.md now reads as

cpanm -l ~ --notest BioPerl Bio::Sketch::Mash DBD::SQLite DBI Graph::Dijkstra

Fixed in 71f7470

csoneson commented 4 years ago

@greatfireball - did you have a chance to look at this submission?