uclahs-cds / tool-Nextflow-action

GNU General Public License v2.0

Add (reworked) action to build and deploy documentation website #10

Closed nwiltsie closed 9 months ago

nwiltsie commented 9 months ago

Description

This is a reworking of #2 that addresses the remaining comments on that PR. The below is written as if everything were new to this PR, but 99% of this is @zhuchcn's work.

This adds an action to generate a GitHub Pages documentation website from any repository with a single README.md file. You can see an example for my user repository here: https://sturdy-broccoli-yr91qq9.pages.github.io/latest/ (I have no idea where the automatically-generated "sturdy-broccoli" came from). Those documents are generated from this branch using this workflow:

```yaml
---
name: Build and Deploy Docs

on:
  workflow_dispatch:
  push:
    branches:
      - add_action

jobs:
  build:
    name: Deploy docs
    runs-on: ubuntu-latest
    steps:
      - name: Checkout main
        uses: actions/checkout@v3

      - name: Deploy docs
        uses: uclahs-cds/tool-Nextflow-action/build-and-deploy-docs@nwiltsie_hacking
```

Changes Since #2

Versioning

The action uses the generically-named mike to create versioned pages, available via a drop-down in the upper-left:

[Screenshot: version drop-down in the upper-left of the documentation site]

Those versions are the output of `git describe --tags --always`; because my repo has no tags, they fall back to the abbreviated commit hash. For something more realistic, like the align-DNA pipeline, both tagged and untagged commits produce more useful version strings:

```console
pipeline-align-DNA $ git describe --tags --always 8e8433c
v9.0.0-53-g8e8433c
pipeline-align-DNA $ git describe --tags --always 589bb5f
v10.0.0-rc.1
```

https://sturdy-broccoli-yr91qq9.pages.github.io/latest/ will always point to the latest documentation, and it would be easy to add more aliases.
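For readers unfamiliar with the format of those version strings, here is a minimal Python sketch that splits `git describe --tags --always` output into its parts. It is an illustration only, not code from the action; the names `Described` and `parse_describe` are hypothetical, and the heuristic would misread a tag whose name is itself a bare hexadecimal string.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    """Parts of a `git describe --tags --always` string (hypothetical helper type)."""
    tag: Optional[str]                 # nearest tag, or None if the repo has no tags
    commits_since_tag: Optional[int]   # 0 when the commit itself is tagged
    short_hash: Optional[str]          # abbreviated hash, when git includes one


# Untagged commit with a reachable tag: <tag>-<count>-g<abbreviated hash>
_SUFFIX = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")
# No reachable tag at all: --always falls back to the bare abbreviated hash
_BARE_HASH = re.compile(r"^[0-9a-f]{4,40}$")


def parse_describe(text: str) -> Described:
    text = text.strip()
    match = _SUFFIX.match(text)
    if match:
        return Described(match["tag"], int(match["count"]), match["sha"])
    if _BARE_HASH.match(text):
        return Described(None, None, text)
    # Anything else is assumed to be an exact tag name.
    return Described(text, 0, None)
```

With the align-DNA examples above, `v9.0.0-53-g8e8433c` reads as "53 commits after tag `v9.0.0`, at commit `8e8433c`", while `v10.0.0-rc.1` is an exact tag.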

Markdown Parsing

As suggested by @aholmes, I'm using MarkdownIt and MDFormat for interacting with the markdown.

Security Concerns

Again suggested by @aholmes, I tried to be incredibly paranoid about referencing files outside of the repository. Any of the following should cause errors:

Link Rewriting

Links (including links to images and links to other files) within the README are rewritten in the following ways. All "relative" links are resolved and are only actually treated as relative if they are within the repository.

| Type | Action |
| --- | --- |
| Anything with a scheme (https, ftp, etc.) | Leave as-is |
| Relative link within docs/ folder | Rewrite to remove docs/ |
| Relative link to an image | Copy image to docs/imgs/ and rewrite link |
| Other relative links | Deep-link to the file on GitHub (see this page for an example) |
| Anchor links | Rewrite to correct for split pages |
| Fake relative paths (../../secretfile.txt) | Leave as-is (will be a broken link) |
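The table can be read as a decision procedure. The following is a rough Python sketch of that procedure for links found in a top-level README.md; it is not the action's actual implementation. The function name `classify_link`, the image suffix list, the precedence of the docs/ rule over the image rule, and the treatment of absolute paths are all assumptions, and the sketch only normalizes path strings (it does not touch the filesystem or resolve symlinks).

```python
import posixpath
from urllib.parse import urlsplit

# Assumed set of image extensions; the real action may use a different test.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg"}

LEAVE = "leave as-is"
STRIP_DOCS = "strip docs/ prefix"
COPY_IMAGE = "copy image to docs/imgs/"
DEEP_LINK = "deep-link to GitHub"
ANCHOR = "rewrite anchor"


def classify_link(href: str) -> str:
    """Pick the table row for a link in a README.md at the repository root."""
    parts = urlsplit(href)
    if parts.scheme:
        return LEAVE                      # https, ftp, mailto, ...
    if not parts.path:
        return ANCHOR if parts.fragment else LEAVE
    # Collapse "." and ".." segments relative to the repository root.
    resolved = posixpath.normpath(parts.path)
    escapes_repo = (
        resolved == ".."
        or resolved.startswith("../")
        or posixpath.isabs(resolved)      # assumption: absolute paths are left alone
    )
    if escapes_repo:
        return LEAVE                      # "fake" relative path: ends up a broken link
    if resolved.startswith("docs/"):
        return STRIP_DOCS
    if posixpath.splitext(resolved)[1].lower() in IMAGE_SUFFIXES:
        return COPY_IMAGE
    return DEEP_LINK
```

For example, `classify_link("docs/../../secretfile.txt")` returns `"leave as-is"` because the path still escapes the repository after normalization, even though it begins with `docs/`.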

Closes #3


yashpatel6 commented 9 months ago

We discussed this in the Nextflow WG, so I'll just record it here: for the commits that are added to the pages, we may want to follow an approach similar to the one we use for Docker images, where documentation is built for every tagged release and a single ongoing page is kept for the current commit of the main branch, instead of accumulating a separate page for every commit that's merged into main.

nwiltsie commented 9 months ago

> We discussed this in the Nextflow WG, so I'll just record it here: for the commits that are added to the pages, we may want to follow an approach similar to the one we use for Docker images, where documentation is built for every tagged release and a single ongoing page is kept for the current commit of the main branch, instead of accumulating a separate page for every commit that's merged into main.

I concur - all that would be required for that is some additional logic here to assign untagged commits as "development" and only use git describe for tagged commits:

https://github.com/uclahs-cds/tool-Nextflow-action/blob/ff81b91b5d31f15aed0d7864814ccd1d27812aab/build-and-deploy-docs/action.sh#L21-L27
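As a rough illustration of that proposal (written in Python rather than the shell of the linked `action.sh`, whose contents are not reproduced here), the version label could be chosen by asking git whether the commit is exactly tagged. `git describe --tags --exact-match` exits non-zero when no tag points at the commit. The names `run_git` and `docs_version` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run_git(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a git subcommand in the current directory and capture its output."""
    return subprocess.run(
        ["git", *args], capture_output=True, text=True, check=False
    )


def docs_version(
    commit: str = "HEAD",
    git: Callable[[Sequence[str]], subprocess.CompletedProcess] = run_git,
) -> str:
    """Return the tag name for tagged commits, otherwise "development"."""
    result = git(["describe", "--tags", "--exact-match", commit])
    if result.returncode == 0:
        return result.stdout.strip()
    # Untagged commit: every such build overwrites one shared page.
    return "development"
```

Passing the `git` runner as a parameter is only a convenience so the function can be exercised without a real repository.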

yashpatel6 commented 9 months ago

Makes sense. We can make that the default behavior if there are no objections from @uclahs-cds/nextflow-wg or @uclahs-cds/infrastructure-wg.

nwiltsie commented 9 months ago

Okay, I'm going to merge this as-is, but I've created #11 that we should address in a separate PR before this goes into use.