ropensci / software-review

rOpenSci Software Peer Review.

Presubmission inquiry: rslurm package #368

Closed qdread closed 4 years ago

qdread commented 4 years ago

Submitting Author: Quentin Read (@qdread)
Repository: https://github.com/sesync-ci/rslurm/


Package: rslurm
Type: Package
Title: Submit R Calculations to a 'Slurm' Cluster
Description: Functions that simplify submitting R scripts to a 'Slurm' 
    workload manager, in part by automating the division of embarrassingly
    parallel calculations across cluster nodes.
Acknowledgements: Development of this R package was supported by the National 
    Socio-Environmental Synthesis Center (SESYNC) under funding received from 
    the National Science Foundation grants DBI-1052875 and DBI-1639145.
Version: 0.5.0
License: GPL-3
URL: https://github.com/SESYNC-ci/rslurm
BugReports: https://github.com/SESYNC-ci/rslurm/issues
Authors@R: c(person('Philippe', 'Marchand', email = "marchand.philippe@gmail.com", role = 'aut'),
             person('Ian', 'Carroll', role = 'aut'),
             person('Mike', 'Smorul', role = 'ctb'),
             person('Rachael', 'Blake', role = 'ctb'),
             person('Quentin', 'Read', email = 'qread@sesync.org', role = c('ctb', 'cre')),
             person('Se Jong', 'Cho', role = 'art')
             )
Depends:
    R (>= 3.5.0)
Imports:
    whisker (>= 0.3)
RoxygenNote: 7.0.2
Suggests:
    parallel,
    testthat,
    knitr,
    rmarkdown
VignetteBuilder: knitr

Scope

The package lets users write code that is executed in parallel on a Slurm cluster using mapply- and lapply-like statements. Shell scripts are created and submitted behind the scenes, so the user’s workflow can be contained entirely in a single R script. It is essentially a wrapper for both the Slurm software and the parallel package. The rslurm package is classified as a scientific software wrapper because its main use is to submit cluster jobs managed by Slurm through R, and as workflow automation because it can encapsulate an R and Slurm workflow in a single R script to facilitate automation.

The target audience is people at research institutions who have access to a high-performance computing cluster with Slurm installed. That now includes many universities and government research institutions in multiple countries. Its scientific application is to simplify any research computing workflow in R that includes embarrassingly parallel computations. Instead of being split among multiple R scripts and shell scripts to set up the input data for the jobs (R), run the jobs (shell), then collect and process job output (back to R), the whole workflow can be done in a single R script.
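
To make the idea concrete, here is a minimal sketch, not the package's actual implementation. The helper `run_chunked()` and the example function `sim_mean()` are hypothetical names made up for illustration. The helper only mimics locally, in base R, the general idea of dividing an mapply-style call into one chunk per node. The guarded block at the end shows roughly how the same parameters could be handed to rslurm, assuming the package is installed and `sbatch` is on the path. The job name there is also made up.

```r
# Hypothetical example function: mean of n normal draws centred on mu.
sim_mean <- function(mu, n) mean(rnorm(n, mean = mu))

# One row per task; column names match the function's argument names.
params <- data.frame(mu = 1:8, n = 1000)

# Local stand-in (an assumption, not rslurm internals): split the rows
# into contiguous chunks, one per "node", and mapply over each chunk.
run_chunked <- function(f, params, nodes = 2) {
  n_tasks <- nrow(params)
  chunk_size <- ceiling(n_tasks / nodes)
  chunk_id <- ceiling(seq_len(n_tasks) / chunk_size)
  chunks <- split(params, chunk_id)
  results <- lapply(chunks, function(chunk) {
    do.call(mapply, c(list(FUN = f), as.list(chunk)))
  })
  unlist(results, use.names = FALSE)
}

local_out <- run_chunked(sim_mean, params, nodes = 2)

# On a machine with rslurm installed and Slurm available, the same call
# could be submitted as a job; this block is skipped everywhere else.
if (requireNamespace("rslurm", quietly = TRUE) && nzchar(Sys.which("sbatch"))) {
  sjob <- rslurm::slurm_apply(sim_mean, params, jobname = "sim_mean_demo",
                              nodes = 2, cpus_per_node = 2)
  cluster_out <- rslurm::get_slurm_out(sjob, outtype = "table")
  rslurm::cleanup_files(sjob)
}
```

The local helper returns one result per row of `params`, in row order, which is the same shape of result an mapply call over the full table would give.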

Since our package was first released, an alternative, slurmR, has also been released. I believe the two packages largely overlap in functionality, and ours is at least as good as slurmR in both ease of use and performance. We are actively addressing issues and maintaining the package.

The package was originally written for people at our research center, SESYNC, which is funded by the U.S. National Science Foundation. The original devs submitted it to CRAN a couple of years ago and it picked up a fair number of users there. It languished for a while without being updated, but I recently took over active maintenance of the package. I have cleared out the queue of outstanding issues and added a couple of new functions. I know that the rOpenSci user guide suggests submitting here before CRAN, but the package has already been on CRAN for a while and we think it would be appropriate for rOpenSci too.

maelle commented 4 years ago

Thanks @qdread for your pre-submission inquiry! The slurmR package has a comparison table that seems to indicate it has more functionality than rslurm. Can you comment a bit on the differences? Thank you!

qdread commented 4 years ago

I think the functionality overlaps for the most part. rslurm now has functions corresponding to mapply() and lapply(), so I think it should also be listed as "yes" for the apply family (column 2 in the table). However, they are correct that rslurm does not support restarting individual tasks that terminated with errors (column 1), nor does it support creating multinode clusters on Slurm so that functions from parallel can be used directly (column 3). We are considering adding the ability to rerun a subset of tasks from a job, but I don't think the multinode cluster feature is really necessary for our package. I think it's preferable to use job arrays.

In general, I would argue that rslurm is simpler and more streamlined than slurmR, which I think makes it more user-friendly at the expense of possibly having slightly fewer features. The goal in creating the package was to make using a Slurm cluster easier for users who are otherwise unfamiliar with how to parallelize their code.

So to sum up, I agree with their table but I think we have come down on the side of simplicity. The one feature I would consider adding would be support for rerunning jobs and automatically increasing memory or time limits for tasks that terminated due to running out of resources. That would be useful for debugging as well.

qdread commented 4 years ago

Following up on my previous answer, here are two other points:

qdread commented 4 years ago

Hi and sorry for pestering, but I was curious whether submitting the package is OK based on the response I gave to the request for more comment. Thanks!!!

maelle commented 4 years ago

Thanks for commenting and sorry for the misunderstanding: I'm actually waiting for your discussion with slurmR authors (to update their comparison table) before taking a new look.

qdread commented 4 years ago

The author of slurmR already updated the comparison table, which I think is now more or less correct.

maelle commented 4 years ago

Thanks, we're discussing.

maelle commented 4 years ago

Thanks for providing more information. As per our current policies, in particular about overlap,