add workflow for link checking and spelling

bollwyvl commented 4 years ago

fixes #56

This adds a new workflow which performs:

- spellchecking with hunspell against en-US and en-GB (the latter is more frequently updated)
  - this gets installed from conda-forge
  - findings
    - nothing too surprising: Jupyter, Xeus, etc. aren't in the dictionary... yet!
    - otherwise, awesome! good job team!
- link checking with pytest-check-links
  - this gets installed with pip
  - it's not currently checking anchors, because we haven't released a version that supports the anchors that sphinx generates (my fault)
  - findings
    - a bunch of the declarativewidgets links are broken. they just get skipped for now

I've tried some caching (still learning actions), and the total delta seems not so long. I didn't touch the existing workflows, but they should probably share requirements files. It has been suggested that we might get reusable step definitions (a la azure pipelines) in the future.

bollwyvl commented 4 years ago

On Fri, Jun 19, 2020, 10:36 Chris Holdgraf wrote:

@choldgraf commented on this pull request.

A couple quick comments in there - thanks for taking a stab at building some infra for QC!

In .github/workflows/check.yml https://github.com/jupyter/enhancement-proposals/pull/58#discussion_r442875875 :

        restore-keys: |
          pip-check-links-
          pip-
    - name: Set up Python 3.7
      uses: actions/setup-python@v1
      with:
        python-version: 3.7
    - name: Install dependencies
      run: |
        pip install -U -r .ci_support/requirements-check-links.txt
    - name: Find broken links
      run: |
        bash .ci_support/check_links.sh

Another option is that we could use the link checker from Sphinx. Jupyter Book supports users calling it with jupyter-book build . --builder linkcheck. It's a little finicky in my experience, but maybe that's true of all link checkers. Could simplify the build process a bit?

Not sure if it caches: in CI, it's easy to get API rate-limited, by GitHub in particular.

On another project with a heavily customized Sphinx build, I came to the realization that I don't care how the link gets there, or what it is in some intermediate form: all that matters is what ends up in the HTML, and that goes for all asset types, including those injected by themes, plugins, etc. that might be added after the Sphinx link checker runs.
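The "check what ends up in the HTML" idea can be sketched roughly as below. This is only an illustration using Python's standard library, not a description of how pytest-check-links works internally; `LinkCollector` and `collect_links` are hypothetical names, and the sample page is made up.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect URL-bearing attributes from rendered HTML (hypothetical helper)."""

    # Attributes that point at other resources, regardless of whether the
    # author, the theme, or a plugin put them on the page.
    URL_ATTRS = {"href", "src"}

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                self.links.append((tag, value))


def collect_links(html):
    """Return (tag, url) pairs found in the final HTML, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    # Made-up page: two assets a theme might inject, plus authored content.
    page = """
    <html><head>
      <link rel="stylesheet" href="_static/theme.css">
      <script src="_static/plugin.js"></script>
    </head><body>
      <a href="https://jupyter.org">Jupyter</a>
      <img src="logo.png">
    </body></html>
    """
    for tag, url in collect_links(page):
        print(tag, url)
```

Because the collector only sees the built output, it treats theme stylesheets, scripts, images, and authored links the same way; actually requesting each URL (and caching results to stay under rate limits) would be a separate step.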

In .github/workflows/check.yml https://github.com/jupyter/enhancement-proposals/pull/58#discussion_r442876400 :

        restore-keys: |
          conda-spelling-
          conda-
    - name: Install dependencies
      uses: goanpeca/setup-miniconda@v1.6.0
      with:
        activate-environment: spelling
        channel-priority: strict
        environment-file: .ci_support/environment-spelling.yml
        use-only-tar-bz2: true
    - name: Find misspelled words
      shell: bash -l {0}
      run: |
        bash .ci_support/check_spelling.sh

What is the likelihood that this is going to be triggered on a regular basis just because people are using non-standard words in JEPs etc? I'm a bit concerned that this is going to feel like a nagging bot that keeps telling people to slightly change wording that they think is correct. Do you know what I mean?

High. But that's the point. Heavy use of unexplained jargon is not helpful in a spec. If a JEP introduces new terms, then they are new terms, and there's a section for it called out in the template. We could fiat all such terms directly, but only if they are headings under the section in question. Xeus would be a good example of this.

We could grep out capital words/mixed case words, which would catch names... But then a typo in a name elsewhere would not be caught. And that makes it harder to catch Jupyterhub vs JupyterHub.
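The trade-off above can be sketched like this. It is a hedged illustration only: `PROJECT_TERMS`, `filter_with_allowlist`, and `filter_skip_capitalized` are hypothetical names, and the input list stands in for whatever words a spell checker reports as unknown.

```python
# Hypothetical allowlist of exact, case-sensitive project terms.
PROJECT_TERMS = {"Jupyter", "JupyterHub", "Xeus"}


def filter_with_allowlist(unknown_words, allowed=PROJECT_TERMS):
    """Keep only findings that are not an exact match for an allowed term."""
    return [word for word in unknown_words if word not in allowed]


def filter_skip_capitalized(unknown_words):
    """Alternative: drop any finding that contains an uppercase letter."""
    return [word for word in unknown_words if word == word.lower()]


if __name__ == "__main__":
    # Assumed to be words a spell checker reported as unknown.
    findings = ["JupyterHub", "Jupyterhub", "Xeus", "kernal"]
    print(filter_with_allowlist(findings))
    print(filter_skip_capitalized(findings))
```

With the exact-match allowlist, the miscapitalized "Jupyterhub" is still reported alongside "kernal"; skipping anything with an uppercase letter reports only "kernal" and so hides the capitalization mistake.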

In .github/workflows/check.yml https://github.com/jupyter/enhancement-proposals/pull/58#discussion_r442876573 :

        restore-keys: |
          pip-build-
          pip-
    - name: Install dependencies
      run: |
        pip install -U -r .ci_support/requirements-build.txt
    - name: Build the book
      run: |
        jupyter-book build .
    - name: Upload book
      uses: actions/upload-artifact@v2
      with:
        name: _build

wow I didn't realize it was so easy to persist artifacts in GHA haha

Yeah, it's decent... But artifacts are not accessible between different workflow yaml files. And they flake sometimes. And you can't "needs" them, just a whole "job," which is kind of a bummer.


choldgraf commented 4 years ago

yeah that makes sense - in that case I'm +1 on trying out this infrastructure and seeing how it feels in practice. Good point about the linkcheck caching and rate limits.

also just a note that when you reply to these via email, it's injecting a ton of extra stuff into your comments :-)
