openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
722 stars 38 forks source link

[REVIEW]: Multilocus sequence typing by blast from de novo assemblies against PubMLST #118

Closed whedon closed 7 years ago

whedon commented 8 years ago

Submitting author: @andrewjpage (Andrew Page) Repository: https://github.com/sanger-pathogens/mlst_check Version: v2.1.1630910 Editor: @pjotrp Reviewer: @harmn Archive: 10.6084/m9.figshare.4285097.v1

Status

status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/0b801d23613c9b626c2b6028f8c14056"><img src="http://joss.theoj.org/papers/0b801d23613c9b626c2b6028f8c14056/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/0b801d23613c9b626c2b6028f8c14056/status.svg)](http://joss.theoj.org/papers/0b801d23613c9b626c2b6028f8c14056)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer questions

Conflict of interest

General checks

Functionality

Documentation

Software paper

Paper PDF: 10.21105.joss.00118.pdf

whedon commented 8 years ago

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks for JOSS.

For a list of things I can do to help you, just type:

@whedon commands
arfon commented 8 years ago

👋 @harmn - this is where the review for this submission take place. @pjotrp can help you with any questions you might have.

happykhan commented 8 years ago

I can confirm that the program works as described, on real world data and meets all of the requirements stated here.

andrewjpage commented 7 years ago

Thanks @happykhan If you need some alternative suggestions for reviewers, heres some researchers who have written MLST code & are active on GitHub (and I have never published with):

@lskatz (CDC - Centers for Disease Control) https://github.com/lskatz/lyve-MLST @aunderwo (PHE - Public Health England) https://github.com/phe-bioinformatics/MOST @jacarrico (Universidade de Lisboa, Portugal) https://github.com/jacarrico/MLSTtrimm & ReMatCh

pjotrp commented 7 years ago

Relax @andrewjpage. @harmn has agreed to review.

harmn commented 7 years ago

indeed, looking for some time to get on it.

Harm

On 23 Nov 2016, at 10:35, Pjotr Prins notifications@github.com<mailto:notifications@github.com> wrote:

Relax @andrewjpagehttps://github.com/andrewjpage. @harmnhttps://github.com/harmn has agreed to review.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/openjournals/joss-reviews/issues/118#issuecomment-262468060, or mute the threadhttps://github.com/notifications/unsubscribe-auth/ABZuOe8q1GPU3GY44i4oq1y0rbqx0WCeks5rBAjTgaJpZM4Kxz06.

harmn commented 7 years ago

I reviewed mlst_check: it installs and runs. Some issues/comments below, in general the documentation is a bit limited.

Installation instructions Installation works fine on Ubuntu 16.04.1 LTS, but it does require root access/sudo rights or a docker installation. Might be good to mention that.

Example usage I could not find the sample files described in the README.md file for the docker container (/data/sample1.fa etc.). I used the Salmonella files in "mlst_check/example/input_data" to test the software.

Documentation The used methods are poorly described (as far as I can tell), for instance I had to look in the docstrings to find out what is used to find the best hit: "The best hit has the greatest number of matching bases. If two hits have the same number of matching bases, the one with the greater percentage identity is selected."

I had to search the code for the meaning of the tilde in the ST column, "Add tilde to matches which are not 100%". Would be good to add that to the README.md.

What is exactly expected in the input FASTA files (I assume complete genomes)? What is the output if your sequences have no hits with the selected scheme?

Automated tests There are quite a number test scripts, but I could not find any description of the tests. I ran all and found no problems.

References Page et al. does not have a doi (should be http://dx.doi.org/10.1099/mgen.0.000083)

pjotrp commented 7 years ago

@andrewjpage Do you mind addressing mentioned issues?

andrewjpage commented 7 years ago

@harmn Thanks for the comprehensive review. @pjotrp Yes I will address all of the issues, but it will take me a few days.

andrewjpage commented 7 years ago

Hi @harmn Thanks for reviewing this submission.

Installation instructions Installation works fine on Ubuntu 16.04.1 LTS, but it does require root access/sudo rights or a docker installation. Might be good to mention that.

I have clarified that root access is needed, and also added instructions for installing with HomeBrew/LinuxBrew (which doesn't need root).

Example usage I could not find the sample files described in the README.md file for the docker container (/data/sample1.fa etc.). I used the Salmonella files in "mlst_check/example/input_data" to test the software.

I have added test files to the docker container so that the example command works out of the box.

Documentation The used methods are poorly described (as far as I can tell), for instance I had to look in the docstrings to find out what is used to find the best hit: "The best hit has the greatest number of matching bases. If two hits have the same number of matching bases, the one with the greater percentage identity is selected."

I had to search the code for the meaning of the tilde in the ST column, "Add tilde to matches which are not 100%". Would be good to add that to the README.md.

What is exactly expected in the input FASTA files (I assume complete genomes)? What is the output if your sequences have no hits with the selected scheme?

I have extended README documentation to clarify the methods used, the meaning of the various notations and extended the examples.

Automated tests There are quite a number test scripts, but I could not find any description of the tests. I ran all and found no problems.

By default the tests produce minimal output when running, but can be run in a more verbose mode 'dzil test --test-verbose' which gives a text string for each test. I have improved the messages coming out, and added more descriptive text to blocks of tests to make it clearer whats going on. Also I have set TravisCI to run the tests in verbose mode so that its easier to see.

References Page et al. does not have a doi (should be http://dx.doi.org/10.1099/mgen.0.000083)

Thanks this has been corrected.

harmn commented 7 years ago

@andrewjpage nice work @pjotrp happy to accept this

pjotrp commented 7 years ago

@arfon ready to accept.

arfon commented 7 years ago

@andrewjpage - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? I can then move forward with accepting the submission.

andrewjpage commented 7 years ago

I've added it to Figshare as: https://dx.doi.org/10.6084/m9.figshare.4285097.v1

arfon commented 7 years ago

@whedon set 10.6084/m9.figshare.4285097.v1 as archive

whedon commented 7 years ago

OK. 10.6084/m9.figshare.4285097.v1 is the archive.

arfon commented 7 years ago

Many thanks for the review @harmn and for editing this one @pjotrp.

@andrewjpage - your paper is now accepted into JOSS and your DOI is http://dx.doi.org/10.21105/joss.00118 ⚡️ 🚀 💥

genomematt commented 7 years ago

There are unformatted citations in the paper @arfon @pjotrp [Seeman2016; Jolley2010]

andrewjpage commented 7 years ago

Sorry Matt, I've fixed it now in the original markdown file.

arfon commented 7 years ago

Thanks for the heads up @genomematt - this should be fixed now.