NERC-CEH / rse_group


Code review for contributions to MAMBO project #30

Open metazool opened 1 month ago

metazool commented 1 month ago

As a group we've had a request to offer code review of work contributed to the MAMBO project.

For now this is a placeholder issue while we discover:

I've reached out to the PI and am trying to close the loop by pointing the code contributors here. Links to repos and docs will be added as they are discovered. The date for completion of whatever can be done is likely to be the end of November 2024.

Anticipate getting out of this exercise:

metazool commented 4 days ago

Quick update on this - we've received links to three projects, one of which is accessible to read.

As submitted it needs more documentation; I've pushed back on accepting it for review until it reaches a certain threshold, and we should be involved in helping to define what that threshold is for future projects that look like this! Included here is the bulk of my response by email:

https://book.the-turing-way.org/reproducible-research/reviewing/reviewing-checklist – the Turing Institute produce a decent, if idealised, guide to the process of reviewing research code that we would be looking to go through.

Their checklist goes into a lot of detail, not all of which we would insist upon. For example, if someone is working in a setting where they have never been expected to produce automated tests that run and check their code, and they hadn’t been explicitly asked to do this at the project outset, then we couldn’t demand that of them (and would consider contributing some tests ourselves as part of the review process).

How much capacity do the project partners have to ask the contributors to do a bit more work to make their code reusable? As a minimum threshold for the results to be reproduced, code included in a research project needs:

• Description of how to build the software, and guidance for any software it depends on
• Description of how to run it against the sample data provided, including any specific conditions for the environment it runs in
• Description of what the intended outcome is
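To make that concrete, a few paragraphs along these lines would probably clear the bar – the project name, commands and filenames below are invented purely for illustration:

```text
MAMBO species classifier – example README

How to build
  Written for Python 3.10+. Install the dependencies with:
    pip install -r requirements.txt

How to run
  Run against the sample data supplied alongside the code:
    python run_classifier.py --input sample_data.csv --output results.csv
  Tested on Linux; no other environment assumptions.

Intended outcome
  Writes results.csv containing one predicted class per row of the input file.
```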

This doesn’t need to be a lot of overhead, or a formal document – it could be a few paragraphs in a README.txt file. I would push back against accepting anything that doesn’t have this in place.

We could, and ideally should, ask for more thoroughness and go further into that Turing checklist: documentation within the code itself about what each function does; a standard layout to aid readability; and crucially, a set of code tests that check its output, check that it handles bad data properly, and illustrate its inner workings. But if this wasn’t requested up front, and would cause extra work the project partners don’t have resources for, then we probably can’t expect it now.
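For reference, the tests mentioned above don’t need to be elaborate either. A minimal sketch in pytest, assuming a hypothetical `analysis` module with a `summarise` function standing in for whatever entry point the submitted code actually exposes:

```python
# test_analysis.py – illustrative only; `analysis.summarise` is a hypothetical
# stand-in for the project's real entry point.
import pytest

from analysis import summarise


def test_summarise_known_output():
    # Checks the output against a small, hand-checked sample.
    records = [{"species": "A", "count": 3}, {"species": "B", "count": 5}]
    assert summarise(records)["total"] == 8


def test_summarise_rejects_bad_data():
    # Bad input should fail loudly rather than silently produce nonsense.
    with pytest.raises(ValueError):
        summarise([{"species": "A", "count": "not a number"}])
```

A couple of tests like these double as worked examples of how the code is meant to be called, which helps the next reader as much as the reviewer.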