openjournals / joss

The Journal of Open Source Software
https://joss.theoj.org
MIT License

Require tests for falsifiable claims in manuscript/docs? #932

Open sneakers-the-rat opened 3 years ago

sneakers-the-rat commented 3 years ago

Hello! New around here, love the journal, think this system is great, but have been thinking about something... searched in previous issues and didn't find something addressing this, so forgive me if this has already been discussed. I'm making this suggestion in a sort of good-faith questioning kind of way, not a hard recommend or demand or anything, so interested to see what y'all think :)

I appreciate the lightweight nature of JOSS papers, and that scientific software papers don't need to (and probably shouldn't) adopt every practice of traditional scientific papers, but I think we are missing one kind of citation in the review requirements. Scientific papers typically require that all specific, falsifiable statements of fact be accompanied by a citation to another paper or substantiating data within the paper. I think the natural analogy for scientific software papers would be to require a link to a (passing) test for specific factual claims made in the manuscript.

Current Status

The Review criteria require:

Functionality

Reviewers are expected to install the software they are reviewing and to verify the core functionality of the software.

Tests

Authors are strongly encouraged to include an automated test suite covering the core functionality of their software.

Good: An automated test suite hooked up to an external service such as Travis-CI or similar
OK: Documented manual steps that can be followed to objectively check the expected functionality of the software (e.g., a sample input file to assert behavior)
Bad (not acceptable): No way for you, the reviewer, to objectively assess whether the software works

and the Review Checklist specifies:

  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Room for Growth

"core functionality," used in multiple places, is understandably vague. One way of isolating some of what the core functionality of the package might be the specific claims that the authors make about the functionality. "functional claims" in the comes close to this, but that too needs a bit more specificity.

The checklist item for tests is currently just a confirmation that there are indeed some tests, but it does not specify, e.g., coverage or some minimal criteria the tests need to satisfy. I see how it's tricky to articulate a minimum, because testing is subtle, programmatic metrics like lines of code covered don't really reflect the quality of tests, and I certainly don't think every line of code needs to be tested to make it through review. The review criteria for tests are a bit more descriptive, but they too have some ambiguity in "core functionality" and "expected functionality."

One way of improving here might be to link the two: a) specify that "core functionality" is constituted by the specific things the authors claim the software can do, and b) set the minimum standard for tests to be that they cover all of the claims specified as core functionality.

Suggestion

Add sections to the description of the JOSS paper, review criteria, and review checklist that require a reference to a passing test for specific, falsifiable claims about the functionality of the software made in the manuscript.
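As a rough illustration of what such a claim-to-test link could look like, here is a minimal sketch. The function, the claim text, and the test name are all invented for the example; JOSS does not prescribe any particular test framework or naming convention.

```python
# Hypothetical example of a test that substantiates one specific,
# falsifiable manuscript claim. Everything here (the function, the
# claim wording, the test name) is illustrative, not a JOSS requirement.

def rolling_mean(values, window):
    """Toy implementation standing in for the package's claimed functionality."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def test_rolling_mean_matches_manuscript_claim():
    """Substantiates the (hypothetical) manuscript claim:
    'the package computes windowed means over numeric sequences.'
    The paper could link directly to this test as its evidence."""
    assert rolling_mean([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
```

A manuscript sentence making that claim would then carry a link to this test (and to a CI run showing it passing), the same way a factual statement in a traditional paper carries a citation.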

I imagine several variations and caveats would be needed:

arfon commented 3 years ago

@sneakers-the-rat - thanks for posting this. Overall I like the suggestion here and I think it would for sure be desirable to encourage some kind of 'proof' that the claims being made are substantiated somewhere in the codebase.

I'm a little wary of adding an additional required section to the JOSS paper, but instead have been thinking about the simplest, most actionable language we could include in the JOSS review criteria. What do you think about:

  • Functionality claims: Is it possible to verify any specific claims made in the manuscript about the functionality of the software (e.g., performance, ease of use, ...)?

With this change we could remove the current criterion (which is now duplicative):

  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)