NorwegianVeterinaryInstitute / ALPPACA

A tooL for Prokaryotic Phylogeny And Clustering Analysis
BSD 3-Clause "New" or "Revised" License
10 stars 3 forks source link

Comments on JOSS submission #27

Closed rcannood closed 1 year ago

rcannood commented 1 year ago

This is my review to your JOSS submission (openjournals/joss-reviews#4677). I apologise for being the typical reviewer number three, but I do provide these comments with the intention of improving the manuscript and codebase.

Manuscript: While the author's work could definitely be of value to the community, I believe the JOSS submission should be reworked to highlight the specific qualities of the work. More specifically, the manuscript focuses a lot on technical aspects, e.g. it is a Nextflow pipeline with tools wrapped in Singularity containers, it generates an HTML report, the "Tracks" section reads like a methodology section of the tools the pipeline consists of. The interesting academic parts of the manuscript would be, "What scientific problem does the work solve? How does it compare to other approaches in the field? What are the unique attributes of this work? What do the results of this pipeline look like? What can a user do with these results?"

Codebase: In addition, the current codebase is set up to be only useful for the author's specific HPC infrastructure, whereas Nextflow is designed to be used on any kind of system. As far as I know, there are no automated tests to evaluate whether the software is performing as intended.

In the comments below, I hope I have provided the seeds for how the authors might start to tackle some of the comments. I'd also be happy to hear from @csoneson on whether some of these topics are not part of my reviewing duties :)

Functionality

I did not verify the installation instructions or functionality as it requires an unreasonable amount of effort, including setting up / acquiring my own HPC infrastructure and refactoring the author's codebase to remove any mentions of the author's HPC infrastructure and replace them with my own.

If done correctly, Nextflow pipelines are generic and can be run on a variety of systems, such as HPC clusters, Kubernetes cloud clusters or simply a beefy computer. For example, I should not need to:

In my opinion, the authors should refactor the installation instructions so it lists something along the lines of:

Step 1: Create a nextflow.config with your infrastructure setup. Here is a template you can start from. Step 2: Run nextflow run NorwegianVeternaryInstitute/ALPPACA -c nextflow.config <your input parameters>.

Documentation

Software paper

hkaspersen commented 1 year ago

@rcannood Thank you so much for the valuable feedback! I will come back to you when I have necessary changes to the pipeline.

rcannood commented 1 year ago

Hey @hkaspersen! Congratulations on addressing all of my comments in such a brief period of time! :)

I believe the documentation, the software repository and the paper all look so much better now.

I have a minor comment on the structure of the manuscript, but I'll pick this up in a different issue.

rcannood commented 1 year ago

Actually, let me reopen this issue ;)

In #35 I see you added a way for testing the pipeline by defining a test profile.

I'm happy there is a test for the pipeline, however the test is not automated just yet.

To fully automate your test, you can browse some of the nf core pipelines and see how they implemented CI. For example, the isoseq pipeline has one: https://github.com/nf-core/isoseq/blob/master/.github/workflows/ci.yml.

hkaspersen commented 1 year ago

Dear @rcannood, thank you for your feedback! I am not completely sure what you mean with automated in this context. Do you mean that the user should not need to actually run the command to test the pipeline? Or is this a test run to check if the pipeline runs after I am updating it?

If you feel it is necessary for me to implement this I am happy to oblige, but I need to remind you that this is not (yet) an NF-Core pipeline, so I just want you to be absolutely sure it is needed! Could you perhaps elaborate on the usefulness of such a functionality?

Thanks in advance!

rcannood commented 1 year ago

I'm referring to the NF core pipeline as an example. An explanation on automated testing can be found in the JOSS documentation here: https://joss.readthedocs.io/en/latest/review_criteria.html#tests . The NF core pipeline I linked to is an example of an automated testing suite using GitHub actions.

hkaspersen commented 1 year ago

@rcannood I was unaware that this was a criteria from JOSS! Thank you for informing me. I will start implementing this as soon as possible.

hkaspersen commented 1 year ago

@rcannood automated test now added! It will run when i push to the dev branch.

rcannood commented 1 year ago

GJ! 😀