unitaryfund / mitiq

Mitiq is an open source toolkit for implementing error mitigation techniques on most current intermediate-scale quantum computers.
https://mitiq.readthedocs.io
GNU General Public License v3.0
363 stars 160 forks source link

Add a scheduled `linkcheck` workflow #2283

Closed purva-thakre closed 6 months ago

purva-thakre commented 7 months ago

Will add better details later.


Previous issue description

https://github.com/unitaryfund/mitiq/blob/35139fe291cc7c37e9d062a60c7a2079e0416e00/.github/workflows/build.yml#L3-L14

As the triggers for make linkcheck will be different, it might be better to define a separate workflow labeled .github/workflows/linkcheck.yml

natestemen commented 7 months ago

So ultimately the reason we use the link checker is to validate that links in our documentation are not dead (bad UX). This is most important when doing a release since a new version of the docs will be deployed to mitiq.readthedocs.io/en/stable. What if we only ran linkcheck on the PR that generates the release (i.e. the thing that comes right before a new version of the docs is shipped)?

This could be achieved by having a label (say release) which we use to label such a PR and then add an action that only runs when pull requests are labeled with such a tag. See further details here: https://stackoverflow.com/questions/62325286/run-github-actions-when-pull-requests-have-a-specific-label

One downside to this is that maintainers have to remember to add this label in order for linkcheck to run. If we went this way it would be documented, but it wouldn't happen automatically.

Alternatively, I do like the idea of running linkcheck weekly. Only downside I can think of there is that it is no ones "responsibility" when it fails. Someone has to "claim it", instead of it (annoyingly) disrupting your PR and requiring you to fix it.

purva-thakre commented 7 months ago

Only downside I can think of there is that it is no ones "responsibility" when it fails. Someone has to "claim it",

In case of a failure, the milestone manager uses a randomizer to assign it to someone?

natestemen commented 7 months ago

Or maybe it makes sense to add this to the responsibility list of the milestone manager.

cosenal commented 7 months ago

We can run linkcheck on every PR, but only make it fail the build on the PR release, I like that idea, Nate! cc @FarLab as the current milestone manager 😬

We should still do the other actions that @purva-thakre mentioned about specific links, such as, redirecting arxiv.org links to export.arxiv.org.

purva-thakre commented 7 months ago

Found a Sphinx extension to redirect the arxiv links: https://documatt.com/sphinx-reredirects/usage.html

Linkcheck also has an option for allowed redirects: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-linkcheck_allowed_redirects

Edit: I am not too familiar with the release process for Mitiq. Going through the release notes, it says that a new tag is created for the release. I wonder if we could setup the workflow to be triggered when a new version tag is created?

If we use something like a release branch, the workflow could be triggered when a new release branch is created as well. https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#example-including-branches-and-tags

nathanshammah commented 7 months ago

Thanks @purva-thakre for opening this issue. I am in favor of a change of the status quo, i.e., a high percentage of failed checks in docs CI come just from linkcheck-related DOI validation.

A step back: I think that linkcheck is mostly useful to check that links to things like internal Mitiq documentation pages or other docs are properly linked. Most of our external links are DOI or other links like arXiv that will likely stay the same "forever". A way to limit linkcheck domain to the documentation in my opinion would be fine. What about skipping the references file in the docs (there is an option,--skip-file)? And this way we also remove the conf.py list of exceptions. It may need some tidy up elsewhere to make sure the links to papers only occur in the references, but this should solve most current fails.

purva-thakre commented 7 months ago

Most of our external links are DOI or other links like arXiv that will likely stay the same "forever". A way to limit linkcheck domain to the documentation in my opinion would be fine. What about skipping the references file in the docs

Fine by me. There is indeed an option for skipping files. In this case, it would mean skipping everything related to refs.bib. If a docstring cites a reference from the bib file, it will require some additional tidying up, as you suggest.

https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-linkcheck_exclude_documents

@natestemen @cosenal Agree/Disagree with Nathan's suggestions?

Edit: Do we still want a weekly workflow or the one that gets triggered when a new release tag is created?

natestemen commented 7 months ago

I like the idea of skipping arxiv, and doi.org and other journal related links! It is their job after all to provide permanent links/redirects. We should trust them with that job. This can be achieved using a few regex's in the linkcheck_ignore list. Perhaps something like the following can replace our existing list.

linkcheck_ignore = [
    r"https://doi.org/.+",
    r"https://arxiv.org/.+",
    r"https://link.aps.org/doi/.+",
]

A step back: I think that linkcheck is mostly useful to check that links to things like internal Mitiq documentation pages

So make docs is enough to catch errors introduced in local file references. make linkcheck is not needed to catch the following situation:

  1. add [link](./doesntexist.md) anywhere to the docs
  2. run make docs
  3. get the following error
    /mitiq/docs/source/guide/pec-3-options.md:347: WARNING: 'myst' cross-reference target not found: './doesntexist.md' [myst.xref_missing]
purva-thakre commented 7 months ago

I'll edit the issue description to have linkcheck ignore all DOI, arxiv links etc.

We still need make linkcheck for the things that are an inline link [some_text](external web page).

For example, our contributing guide has most of those, I think.

https://github.com/unitaryfund/mitiq/actions/runs/8648542706/job/23712587752#step:7:746 https://mitiq.readthedocs.io/en/stable/contributing.html