anishathalye / proof-html

A GitHub Action to validate HTML, check links, and more ✅
MIT License
58 stars 21 forks source link

html-proofer seems to be run multiple times #10

Open franke-biosaxs opened 4 months ago

franke-biosaxs commented 4 months ago

Hi there.

I've integrated the proof-html action into my deploy workflow and, after some trial-and-error got everything to work. Thanks for that!

However, I did notice that the output of html-proofer seems to come in triplicate, i.e. this is repeated three times:

Running 4 checks (Images, Links, OpenGraph, Scripts) in ["./_site"] on *.html files ... [snip errors] HTML-Proofer found XXX failures!

The number of errors and error details are identical in each copy of the output. I can not say whether simply the output is echoed multiple times, or whether html-proofer is being run multiple times. It does not seem to have any negative impact besides a noisy log, though.

The workflow snippet:

      - name: Check URLs
        uses: anishathalye/proof-html@v2
        continue-on-error: true
        with:
          directory: ./_site
          enforce_https: true
          check_favicon: false
          tokens: |
            {"https://github.com": "${{ secrets.GITHUB_TOKEN }}"}
          ignore_url_re: |
            ^https://twitter.com/
          swap_urls: |
            {"^/atsas": ""}

Btw, the README of html-proofer states:

Also make sure that your later step which runs HTMLProofer will not return a failed shell status. You can try something like html-proof ... || true. Because a failed step in GitHub Actions will skip all later steps.

Which is exactly what happened for me until I added continue-on-error: true. Issue was, I ran into a loop-situation of html-proofer complaining about missing external files, which stopped the workflow that would have deployed those files in the first place. Took me a while to find, maybe you could add this option to your examples? Thanks!

anishathalye commented 4 months ago

proof-html runs HTMLProofer a couple times (with cache enabled) to retry external URLs; see the retry option.

The intended use of proof-html is to get a CI failure when something goes wrong, so you get notified and fix the issue. What's the use case where you want to set continue-on-error?

In all our examples, such as this, we run proof-html as the last step in a workflow.

franke-biosaxs commented 4 months ago

Didn't realise that retries would actually re-run the whole process multiple times. I assumed it would just do that, retry N times, then report "not found", but only once.

I took the base workflow directly from one of the GitHub workflow examples to get started. In this example workflow, the first job builds the site and uploads an artefact, the second job downloads the artefact and deploys to Pages. I've added the "Check URL" step to the build job. If it fails, the secondary deploy job doesn't happen. And as described, if new files shall be deployed, they are not found during checking, hence html-proof fails, thus no deployment.

anishathalye commented 4 months ago

Yeah, the output isn't the cleanest, proof-html currently shows the output from all the HTMLProofer runs. You only need to look at the last one, though.

And as described, if new files shall be deployed, they are not found during checking, hence html-proof fails, thus no deployment.

Are these files linked with an absolute URL, but you could use a relative URL while checking, so the link would be resolved locally? See the swap_urls option.

franke-biosaxs commented 4 months ago

Are these files linked with an absolute URL, but you could use a relative URL while checking, so the link would be resolved locally?

I checked, they were all of the type:

At ./_site/...html:12: External link https://...html failed (status code 404)

Where line 12 is (generated):

\<link rel="canonical" href="https://...html" />

So, yes, I as far as I understand, swap_urls would probably solve this. That said, this is all work in progress and there still dozens of other errors left that would stop deployment. Hence the override - which is in line with the html-proofer documentation.