NERC-CEH / rse_group


Code review for contributions to MAMBO project #30

Open metazool opened 1 month ago

metazool commented 1 month ago

As a group we've had a request to offer code review of work contributed to the MAMBO project.

For now this is a placeholder issue while we discover:

I've reached out to the PI and am trying to close the loop by pointing the code contributors here. Links to repos and docs will be added as they are discovered. The date for completion of whatever can be done is likely to be the end of November 2024.

Anticipate getting out of this exercise:

metazool commented 4 days ago

Quick update on this - we've received links to three projects, one of which is accessible to read.

As submitted it needs more documentation; I've pushed back on accepting it for review until it reaches a certain threshold, and we should be involved in helping to define what that threshold is for future projects that look like this! Included here is the bulk of my response by email:

https://book.the-turing-way.org/reproducible-research/reviewing/reviewing-checklist – the Turing Institute produce a decent, if idealised, guide to the process of reviewing research code that we would be looking to go through.

Their checklist goes into a lot of detail, not all of which we would insist upon. For example, if someone is working in a setting where they have never been expected to produce automated tests that run and check their code, and they hadn’t been explicitly asked to do this at the project outset, then we couldn’t demand that of them (and would consider contributing some tests ourselves as part of the review process).

How much capacity do the project partners have to ask the contributors to do a bit more work to make their code reusable? As a minimum threshold for the results to be reproduced, code included in a research project needs:

• Description of how to build the software, and guidance for any software it depends on
• Description of how to run it against the sample data provided, including any specific conditions for the environment it runs in
• Description of what the intended outcome is
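To make that concrete, a few paragraphs along these lines would probably clear the bar – the project name, commands and filenames below are invented purely for illustration:

```text
MAMBO species classifier – example README

How to build
  Written for Python 3.10+. Install the dependencies with:
    pip install -r requirements.txt

How to run
  Run against the sample data supplied alongside the code:
    python run_classifier.py --input sample_data.csv --output results.csv
  Tested on Linux; no other environment assumptions.

Intended outcome
  Writes results.csv containing one predicted class per row of the input file.
```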

This doesn’t need to be a lot of overhead, or a formal document – it could be a few paragraphs in a README.txt file. I would push back against accepting anything that doesn’t have this in place.

We could, and ideally should, ask for more thoroughness and go further into that Turing checklist: documentation within the code itself about what each function does; a standard layout to aid readability; and crucially, a set of code tests that check its output, check that it handles bad data properly, and illustrate its inner workings. But if this wasn’t requested up front, and would cause extra work the project partners don’t have resources for, then we probably can’t expect it now.
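For reference, the tests mentioned above don’t need to be elaborate either. A minimal sketch in pytest, assuming a hypothetical `analysis` module with a `summarise` function standing in for whatever entry point the submitted code actually exposes:

```python
# test_analysis.py – illustrative only; `analysis.summarise` is a hypothetical
# stand-in for the project's real entry point.
import pytest

from analysis import summarise


def test_summarise_known_output():
    # Checks the output against a small, hand-checked sample.
    records = [{"species": "A", "count": 3}, {"species": "B", "count": 5}]
    assert summarise(records)["total"] == 8


def test_summarise_rejects_bad_data():
    # Bad input should fail loudly rather than silently produce nonsense.
    with pytest.raises(ValueError):
        summarise([{"species": "A", "count": "not a number"}])
```

A couple of tests like these double as worked examples of how the code is meant to be called, which helps the next reader as much as the reviewer.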