openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[PRE REVIEW]: DisTRaX: Accelerating High Performance Compute Processing #6013

Closed: editorialbot closed this issue 11 months ago

editorialbot commented 1 year ago

Submitting author: @gmw99 (Gabryel Mason-Williams)
Repository: https://github.com/rosalindfranklininstitute/DisTRaX
Branch with paper.md (empty if default branch): paper
Version: v1.0.0
Editor: Pending
Reviewers: Pending
Managing EiC: Daniel S. Katz

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e"><img src="https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e/status.svg)](https://joss.theoj.org/papers/12d6b40eb05749e9c1367afa25e1dd9e)

Author instructions

Thanks for submitting your paper to JOSS @gmw99. Currently, there isn't a JOSS editor assigned to your paper.

@gmw99 if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands

editorialbot commented 1 year ago

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 year ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.05 s (2007.2 files/s, 106553.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          48            651           1003           1877
reStructuredText                38            393            485            271
TeX                              1              7              0             75
YAML                             2              6              1             69
Markdown                         2             27              0             67
DOS Batch                        1              8              1             26
make                             1              4              7              9
TOML                             1              0              0              3
-------------------------------------------------------------------------------
SUM:                            94           1096           1497           2397
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 1 year ago

Wordcount for paper.md is 1483

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.14569/IJACSA.2016.070211 is OK
- 10.48550/arXiv.2212.03054 is OK
- 10.48550/arXiv.1610.08015 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 1 year ago

Five most similar historical JOSS papers:

FRIEDA: Flexible Robust Intelligent Elastic Data Management Framework
Submitting author: @dghoshal-lbl
Handling editor: @acabunoc (Retired)
Reviewers: @krother
Similarity score: 0.8028

DARE Platform: a Developer-Friendly and Self-Optimising Workflows-as-a-Service Framework for e-Science on the Cloud
Submitting author: @iaklampanos
Handling editor: @danielskatz (Active)
Reviewers: @rafaelfsilva, @Himscipy
Similarity score: 0.7996

DataLad: distributed system for joint management of code, data, and their relationship
Submitting author: @yarikoptic
Handling editor: @arokem (Retired)
Reviewers: @szorowi1, @jkanche
Similarity score: 0.7996

Launcher: A simple tool for executing high throughput computing workloads
Submitting author: @lwilson
Handling editor: @danielskatz (Active)
Reviewers: @kc9qey
Similarity score: 0.7985

hotsub: A batch job engine for cloud services with ETL framework
Submitting author: @otiai10
Handling editor: @brainstorm (Retired)
Reviewers: @reisingerf
Similarity score: 0.7982

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

danielskatz commented 1 year ago

@GMW99 - thanks for your submission. Before we continue, I have some concerns:

  1. Please expand your README; as it stands, it will not pass the JOSS review criteria.
  2. Please make a more substantive case in the paper for this being research software, rather than infrastructure software. Do you expect that researchers would cite this software in their papers? If so, which types of researchers?
  3. In the paper, please provide some discussion about other solutions to this need and competing packages.
  4. I don't see the figures in the paper.

If you do make additions to the paper, you may want to remove some of the current text so that it doesn't get to be too long. Perhaps some text can be replaced by a pointer to documentation?

After you have made changes in the .md file, use the command @editorialbot generate pdf to make a new PDF. editorialbot commands need to be the first entry in a new comment. If you make changes in the references, please use the command @editorialbot check references to check them.

danielskatz commented 1 year ago

👋 @GMW99 - Did you see my comments/requests above ☝️ ?

danielskatz commented 1 year ago

@GMW99 - If I don't hear back from you in the next 2 weeks, I'll mark this paper as rejected, but you can certainly address these issues and then resubmit, if you choose to.

GMW99 commented 1 year ago

Hi @danielskatz,

Sorry for the delayed response. I only work one day a week. I am currently on holiday and hope to get back to you with a complete response to your comments in the next three weeks.

Again, apologies for the delay in my response.

Kind regards

Gabryel

danielskatz commented 11 months ago

@GMW99 - it's now been three weeks - do you have any update?

I think the right thing to do is to mark this as withdrawn, with the idea that you can address the issues I mentioned above, and resubmit at a later point. What do you think?

GMW99 commented 11 months ago

@editorialbot generate pdf

editorialbot commented 11 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 11 months ago

Five most similar historical JOSS papers:

Launcher: A simple tool for executing high throughput computing workloads
Submitting author: @lwilson
Handling editor: @danielskatz (Active)
Reviewers: @kc9qey
Similarity score: 0.8151

FRIEDA: Flexible Robust Intelligent Elastic Data Management Framework
Submitting author: @dghoshal-lbl
Handling editor: @acabunoc (Retired)
Reviewers: @krother
Similarity score: 0.8134

DARE Platform: a Developer-Friendly and Self-Optimising Workflows-as-a-Service Framework for e-Science on the Cloud
Submitting author: @iaklampanos
Handling editor: @danielskatz (Active)
Reviewers: @rafaelfsilva, @Himscipy
Similarity score: 0.8124

DataLad: distributed system for joint management of code, data, and their relationship
Submitting author: @yarikoptic
Handling editor: @arokem (Retired)
Reviewers: @szorowi1, @jkanche
Similarity score: 0.8084

hotsub: A batch job engine for cloud services with ETL framework
Submitting author: @otiai10
Handling editor: @brainstorm (Retired)
Reviewers: @reisingerf
Similarity score: 0.8078

⚠️ Note to editors: If these papers look like they might be a good match, click through to the review issue for that paper and invite one or more of the authors before considering asking the reviewers of these papers to review again for JOSS.

GMW99 commented 11 months ago

Hi @danielskatz

First of all, thank you for your patience.

The following is our response to your concerns surrounding the paper (https://github.com/openjournals/joss-reviews/issues/6013#issuecomment-1790874390) :

  1. I have updated the README to be more expansive and provide a more straightforward scientific use case. (https://github.com/rosalindfranklininstitute/DisTRaX)
  2. We envisage researchers running this software themselves as part of rapidly assembled software pipelines; many researchers develop workflow software across many fields (including cryo-EM). We therefore expect them to deploy DisTRaX within their workflow to set up a temporary in-memory shared disk that speeds up I/O during the workflow run. The idea and software are particularly relevant to cloud (or cloud-like) clusters, where researchers have root access and define the complete "digital instrument" that runs the workflow: workflow software combined with a software-defined infrastructure cluster. For these reasons, we argue that it is research software. We would therefore expect it to be cited like other foundational software such as numpy and pytorch; that is, it would not normally be cited directly, because it is invisible to the end users of the software, but it would appear in the software inventory of workflows that build upon it.
  3. We currently know of no software solutions that compete with DisTRaX. In the paper, we compare it with standard deployment tools such as Ansible and find that DisTRaX outperforms them in the HPC setting, because their sequential nature increases deployment time. The alternative is the hardware solution of adding a file system to your cluster. We state in the paper that this adds expense, security concerns and complexity for both cloud-based clusters and traditional HPC, and that it creates a bottleneck for high-I/O processes. DisTRaX removes this need, making I/O on clusters simpler by using available RAM. We think the paper communicates this clearly, although the lack of figures may have hampered it.
  4. My apologies; you should be able to see them now.

Again, apologies for our delay in replying to you.

Kind regards

Gabryel

danielskatz commented 11 months ago

@editorialbot check repository

editorialbot commented 11 months ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.07 s (1426.1 files/s, 76495.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          48            651           1003           1877
reStructuredText                38            393            485            271
Markdown                         2             47              0             99
TeX                              1              7              0             75
YAML                             2              6              1             69
DOS Batch                        1              8              1             26
make                             1              4              7              9
TOML                             1              0              0              3
-------------------------------------------------------------------------------
SUM:                            94           1116           1497           2429
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 11 months ago

Wordcount for paper.md is 1483

danielskatz commented 11 months ago

👋 @GMW99 - thanks for all the changes. I'm now going to ask the editors to confirm that this is research software as defined by JOSS. You should hear back in a week or two (or perhaps after the holidays).

danielskatz commented 11 months ago

@editorialbot query scope

editorialbot commented 11 months ago

Submission flagged for editorial review.

danielskatz commented 11 months ago

@GMW99 - I'm sorry to say (and also sorry for the holiday delay) that after discussion amongst the JOSS editors, we have decided that this submission is not research software as defined by JOSS. This does not mean that it is not software that is useful in research, but just that JOSS does not consider it in scope for review as research software. Please see https://joss.readthedocs.io/en/latest/submitting.html#other-venues-for-reviewing-and-publishing-software-packages for other suggestions for how you might receive credit for your work.

danielskatz commented 11 months ago

@editorialbot reject

editorialbot commented 11 months ago

Paper rejected.