GenomicDataInfrastructure / standard-operating-procedures

A repository for managing standard operating procedure (SOP) resources for the GDI project.
GNU Affero General Public License v3.0

Linter code and workflow for SOPs #22

Closed M-casado closed 1 month ago

M-casado commented 1 month ago

Summary

I created a Python script, sop-linter.py, that contains the rules for linting SOPs. The rules cover things like required sections and the format of certain tables. Because each linting rule is its own method in the script, rules are easy to add or modify at any time. Anyone can run the script manually before creating a PR to check whether the SOPs pass validation:

# With test SOPs:
python scripts/sop-linter.py tests/ -v 1

# With all real SOPs in the repo:
python scripts/sop-linter.py sops/ -v 1
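
To illustrate the rule-per-method design described above, here is a minimal sketch. It is not the actual contents of sop-linter.py: the class name `SopLinter`, the `rule_` prefix convention, and the section names in `REQUIRED_SECTIONS` are all assumptions made for the example.

```python
import re


class SopLinter:
    """Hypothetical linter: every method whose name starts with rule_ is one check."""

    # Assumed section names, for illustration only.
    REQUIRED_SECTIONS = ("Purpose", "Scope", "Procedure")

    def rule_required_sections(self, text):
        """Report each required section that has no matching markdown heading."""
        headings = {
            m.group(1).strip()
            for m in re.finditer(r"^#+\s+(.+)$", text, re.MULTILINE)
        }
        return [
            f"missing section: {name}"
            for name in self.REQUIRED_SECTIONS
            if name not in headings
        ]

    def rule_table_separator(self, text):
        """Report table header rows that are not followed by a separator row."""
        issues = []
        lines = text.splitlines()
        for i, line in enumerate(lines):
            starts_table = line.startswith("|") and (
                i == 0 or not lines[i - 1].startswith("|")
            )
            if starts_table:
                following = lines[i + 1] if i + 1 < len(lines) else ""
                if not re.fullmatch(r"\|(\s*:?-+:?\s*\|)+", following.strip()):
                    issues.append(f"line {i + 1}: table header without separator row")
        return issues

    def lint(self, text):
        """Run every rule_* method (in name order) and collect the issues."""
        issues = []
        for name in sorted(dir(self)):
            if name.startswith("rule_"):
                issues.extend(getattr(self, name)(text))
        return issues
```

With this layout, adding a check means adding one more `rule_...` method; `lint` picks it up without further wiring.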

I also created a GitHub workflow, lint-sops.yml, that triggers on any PR targeting the main or dev branches of the repository. The workflow runs the SOP linter, giving an easy check on whether new SOPs follow the format and content rules specified in sop-linter.py. Once it has run, its status icon (❌ or ✔️) quickly flags possible format issues in the SOPs being added to the project.
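
A workflow step is marked as failed when the command it runs exits with a non-zero status, so a linter can drive the ❌/✔️ result through its exit code. The sketch below shows that pattern only. The function names, the stand-in rule, and the assumption that SOPs are `.md` files are hypothetical and not taken from sop-linter.py.

```python
from pathlib import Path


def lint_text(text):
    """Stand-in rule (assumption): an SOP must contain a top-level heading."""
    has_title = any(line.startswith("# ") for line in text.splitlines())
    return [] if has_title else ["no top-level heading"]


def run(directory):
    """Lint every .md file under `directory` and return a process exit code."""
    failed_files = 0
    for path in sorted(Path(directory).rglob("*.md")):
        issues = lint_text(path.read_text(encoding="utf-8"))
        for issue in issues:
            print(f"{path}: {issue}")
        if issues:
            failed_files += 1
    # 0 lets the CI step pass; any non-zero value makes it fail.
    return 1 if failed_files else 0


# A script entry point would typically end with: sys.exit(run("sops/"))
```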

Types of changes

Motivation and Context

For multiple contributors to add SOPs quickly to the repository while still maintaining some structure, automatic workflows like these are required. With this addition, we can catch possible issues at the source and keep SOPs consistent with one another.

References

ZenHub #268

Changes Introduced

Review

Not yet

Additional Notes

Both the code and the workflow were tested in my fork of the GDI repository (see testing PR).

This is what it will look like when we add SOPs to the repo at dev or main through PRs: image

As the name implies, we should require the linter to pass all checks (✔️) before an SOP is added to the repo.

Checklist:

General Compliance:

Only applicable if the PR includes new, or changes to, GDI SOPs (i.e., documents at sops/):

M-casado commented 1 month ago

SOP Index table

I created and tested (see PR) new code in order to:

Again, this code simply automates checks that we won't want to do by hand in the future, when reviews are due and new SOPs are being added. This PR therefore contains both the code that lints SOPs and the code that automatically creates the index table listing all SOPs and their information.

The information in the SOP index table (which is empty for now, since no SOPs are public yet) can be easily modified by adding or removing columns at will. I chose the following format based on what I believe would be useful for users (i.e., GDI members) searching for GDI SOPs:

| Name | Identifier | Template version | Type | GDI Node | Instance version | Nº steps | Last modified |
| --- | --- | --- | --- | --- | --- | --- | --- |
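
As a rough sketch of how such an index table could be generated, the function below renders a markdown table from a list of per-SOP metadata dictionaries. The function name `build_index_table` and the dictionary-based input are assumptions for illustration; how the real code extracts metadata from each SOP is not shown here.

```python
# Column names taken from the format listed in this comment.
COLUMNS = [
    "Name",
    "Identifier",
    "Template version",
    "Type",
    "GDI Node",
    "Instance version",
    "Nº steps",
    "Last modified",
]


def build_index_table(sops, columns=COLUMNS):
    """Render a markdown table with one row per SOP metadata dictionary.

    Missing fields are rendered as empty cells.
    """
    header = "| " + " | ".join(columns) + " |"
    separator = "|" + "|".join(" --- " for _ in columns) + "|"
    rows = [
        "| " + " | ".join(str(sop.get(col, "")) for col in columns) + " |"
        for sop in sops
    ]
    return "\n".join([header, separator, *rows])
```

Because the columns are passed as a list, adding or removing a column only requires editing that list, for example `build_index_table(sops, columns=["Name", "Type"])`.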

elisavettorstensson commented 1 month ago

Unfortunately, I don't have enough technical background to review this PR. However, I think the idea of using code to automate more checks (and reviews) in the future is very good!

M-casado commented 1 month ago

No worries, @elisavettorstensson. The key is also to keep all of us aware of the non-technical signal: if there is a ❌ in a PR, it may need to be checked before merging.

The code itself will need to be maintained, and I can do so for as long as I am part of the task. Others may want to contribute to it as well, in a technical (directly adding to it) or non-technical (letting me know of desired changes) way. If you're in the latter group and you feel that some checks or actions would be great to automate for this GitHub repository, feel free to raise them in the fortnightly meetings :)

M-casado commented 1 month ago

I'm hoping to add more checks in the future as well. There are many things that we could automate instead of checking manually. Examples could be:

M-casado commented 1 month ago

Went over the code goals and features at today's meeting.