Closed nwiltsie closed 9 months ago
We discussed this in the Nextflow WG so I'll just put it here: for the commits that are added to the pages, we may want to go with a similar approach to Docker images, where documentation is built for every tagged release and a single ongoing page is kept for the current commit of the main branch instead of accumulating separate pages for every commit that's merged into main.
I concur - all that would be required for that is some additional logic here to assign untagged commits as "development" and only use `git describe` for tagged commits:
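For illustration only (this is not the action's actual code), a minimal sketch of that logic might look like the following. It assumes the action can shell out to `git`; the names `DEVELOPMENT_LABEL`, `choose_version`, and `exact_tag_for` are hypothetical.

```python
import subprocess
from typing import Optional

# Hypothetical label shared by every untagged commit, so only one
# "ongoing" page is kept instead of one page per merged commit.
DEVELOPMENT_LABEL = "development"


def choose_version(exact_tag: Optional[str]) -> str:
    """Return the tag for a tagged commit, otherwise the shared development label."""
    if exact_tag:
        return exact_tag
    return DEVELOPMENT_LABEL


def exact_tag_for(commit: str = "HEAD") -> Optional[str]:
    """Return a tag that points directly at `commit`, or None if there is none."""
    result = subprocess.run(
        ["git", "describe", "--tags", "--exact-match", commit],
        capture_output=True,
        text=True,
        check=False,
    )
    # git exits non-zero when no tag directly references the commit.
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Inside a checkout, `choose_version(exact_tag_for())` would then give either the release tag or `"development"`.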
Makes sense, we can make that the default behavior if no objections from @uclahs-cds/nextflow-wg or @uclahs-cds/infrastructure-wg
Okay, I'm going to merge this as-is, but I've created #11 that we should address in a separate PR before this goes into use.
Description
This is a reworking of #2 that addresses the remaining comments on that PR. The below is written as if everything were new to this PR, but 99% of this is @zhuchcn's work.
This adds an action to generate a GitHub Pages documentation website from any repository with a single README.md file. You can see an example for my user repository here: https://sturdy-broccoli-yr91qq9.pages.github.io/latest/ (I have no idea where the automatically-generated "sturdy-broccoli" came from). Those documents are generated from this branch using this workflow:
Changes Since #2
Versioning
The action uses the generically-named mike to create versioned pages, available via a drop-down in the upper-left:
Those versions are the output from `git describe --tags --always`, and as my repo doesn't have any tags they just fall back to the commit hash. For something more realistic, like the align-DNA pipeline, tagged commits and untagged commits have more useful version strings:
https://sturdy-broccoli-yr91qq9.pages.github.io/latest/ will always point to the latest documentation, and it would be easy to add more aliases.
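To make the three shapes of `git describe --tags --always` output concrete, here is a rough sketch of a parser. It is not part of the action; `Described` and `parse_describe` are hypothetical names, and the version strings in the usage note are made up. One caveat: a tag made up only of hex digits would be mistaken for a bare commit hash.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: Optional[str]                # nearest tag, if any
    commits_since_tag: Optional[int]  # None when no tag is reachable
    short_hash: Optional[str]         # None when the commit itself is tagged


# Untagged commit after a tag: <tag>-<count>-g<abbreviated hash>
_SUFFIX = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")
# No tags at all: --always falls back to the abbreviated hash alone
_BARE_HASH = re.compile(r"^[0-9a-f]{4,40}$")


def parse_describe(text: str) -> Described:
    """Split the output of `git describe --tags --always` into its parts."""
    text = text.strip()
    match = _SUFFIX.match(text)
    if match:
        return Described(match["tag"], int(match["count"]), match["sha"])
    if _BARE_HASH.match(text):
        return Described(None, None, text)
    # Anything else is taken to be a tag that points at the commit itself.
    return Described(text, 0, None)
```

For example, `parse_describe("v8.0.0-3-g1a2b3c4")` yields the tag `v8.0.0`, three commits since that tag, and the short hash `1a2b3c4`.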
Markdown Parsing
As suggested by @aholmes, I'm using MarkdownIt and MDFormat for interacting with the markdown.
Security Concerns
Again suggested by @aholmes, I tried to be incredibly paranoid about referencing files outside of the repository. Any of the following should cause errors:
docs_dir
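As a sketch of the kind of check this implies (an assumption about the approach, not the code in this PR), a path can be resolved and then required to stay under the repository root. `OutsideRepositoryError` and `resolve_inside` are hypothetical names.

```python
from pathlib import Path


class OutsideRepositoryError(ValueError):
    """Raised when a path resolves to a location outside the repository."""


def resolve_inside(repo_root: Path, candidate: str) -> Path:
    """Resolve `candidate` against `repo_root`, refusing anything outside it."""
    root = repo_root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so both
    # "../x" and a symlink pointing elsewhere end up outside `root`.
    # An absolute `candidate` replaces `root` entirely when joined.
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise OutsideRepositoryError(
            f"{candidate!r} resolves outside the repository"
        ) from None
    return target
```

Resolving before comparing is the important part: a plain string prefix check on the unresolved path would let `docs/../../secretfile.txt` through.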
Link Rewriting
Links (including links to images and links to other files) within the README are rewritten in the following ways. All "relative" links are resolved and are only actually treated as relative if they are within the repository.
docs/
folderdocs/
docs/imgs/
and rewrite link ../../secretfile.txt
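The rule that relative links only count as relative when they stay inside the repository could be sketched roughly as below. This is a hedged illustration using only the standard library, not the MarkdownIt-based code in this PR; `classify_link` and its return labels are hypothetical, and what happens to each class of link afterwards is left out.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def classify_link(href: str, readme_dir: Path, repo_root: Path) -> str:
    """Classify a link target found in the README."""
    parts = urlsplit(href)
    # Anything with a scheme or host (https://..., mailto:..., //host/...)
    # points outside the local files and is not a relative link.
    if parts.scheme or parts.netloc:
        return "external"
    # A bare "#fragment" stays within the same page.
    if not parts.path:
        return "anchor"
    root = repo_root.resolve()
    # Resolve the path relative to the README's directory; query strings
    # and fragments are ignored for the containment check.
    target = (readme_dir / unquote(parts.path)).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        return "outside-repository"
    return "in-repository"
```

With the README at the repository root, `docs/imgs/plot.png` would come back as `"in-repository"`, while `../../secretfile.txt` would come back as `"outside-repository"`.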
Closes #3
Checklist
[x] This PR does NOT contain Protected Health Information (PHI). A repo may need to be deleted if such data is uploaded.
Disclosing PHI is a major problem[^1] - Even a small leak can be costly[^2].
[x] This PR does NOT contain germline genetic data[^3], RNA-Seq, DNA methylation, microbiome or other molecular data[^4].
[^1]: UCLA Health reaches $7.5m settlement over 2015 breach of 4.5m patient records
[^2]: The average healthcare data breach costs $2.2 million, despite the majority of breaches releasing fewer than 500 records.
[^3]: Genetic information is considered PHI. Forensic assays can identify patients with as few as 21 SNPs
[^4]: RNA-Seq, DNA methylation, microbiome, or other molecular data can be used to predict genotypes (PHI) and reveal a patient's identity.
.png, .jpeg), .pdf, .RData, .xlsx, .doc, .ppt, or other output files. To automatically exclude such files using a .gitignore file, see here for example.
[x] I have read the code review guidelines and the code review best practice on GitHub check-list.
[x] I have set up or verified the `main` branch protection rule following the github standards before opening this pull request.
[x] The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
[x] I have added the major changes included in this pull request to the `CHANGELOG.md` under the next release version or unreleased, and updated the date.