sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

`linkcheck` build never ends #11853

Closed dbitouze closed 8 months ago

dbitouze commented 8 months ago

Describe the bug

The execution of make linkcheck never ends: the only way to stop it is Ctrl+C.

How to Reproduce

Not sure how to reproduce: we have about 1200 .md source files.

Environment Information

Platform:              linux; (Linux-6.5.13-desktop-6.mga9-x86_64-with-glibc2.36)
Python version:        3.10.11 (main, Apr 16 2023, 03:21:15) [GCC 12.2.1 20230415]
Python implementation: CPython
Sphinx version:        7.2.6
Docutils version:      0.19
Jinja2 version:        3.1.2
Pygments version:      2.16.1

Sphinx extensions

`myst_parser`

Additional context

No response

picnixz commented 8 months ago

Does it work with fewer files? Can we know which link is being tested? Since the CI/CD works, it's impossible for us to investigate further unless we know what to look for.

dbitouze commented 8 months ago

Does it work with fewer files?

With fewer files (112), it almost works. Running with the -vv verbose option, I get:

[...]
[app] emitting event: 'build-finished'(None,)
sphinx-sitemap: No pages generated for sitemap.xml
build finished with problems, 371 warnings.
make: *** [Makefile:20: linkcheck] Error

can we know which link is being tested?

Sorry, I don't see what you mean.

Since the CI/CD works, it's impossible for us to investigate more unless we know what to look for.

I can imagine, sorry for not being able to provide more information.

AA-Turner commented 8 months ago

Without a repository to reproduce the error, I'm forced to close -- sorry.

A

dbitouze commented 8 months ago

Without a repository to reproduce the error, I'm forced to close -- sorry.

Ah, sorry, I didn't know that I could just have you clone our repository for testing. Here are the steps to reproduce the error:

git clone https://gitlab.gutenberg-asso.fr/gutenberg/faq-gut.git
# or:
# git clone ssh://git@gitlab.gutenberg-asso.fr:31022/gutenberg/faq-gut.git
cd faq-gut
python3 -m venv .venv
source .venv/bin/activate
.venv/bin/pip install --no-cache-dir -U pip
.venv/bin/pip install --no-cache-dir \
      Sphinx                         \
      Pillow                         \
      sphinx_comments                \
      sphinx_design                  \
      sphinxext.opengraph            \
      myst_parser                    \
      linkify-it-py                   \
      sphinx_tippy                    \
      sphinx_sitemap                  \
      pydata_sphinx_theme             \
      sphinx_copybutton               \
      sphinx_togglebutton             \
      sphinx_examples                 \
      sphinx_last_updated_by_git
git clone https://gitlab.gutenberg-asso.fr/dbitouze/pygments-acetexlexer.git
# or:
# git clone ssh://git@gitlab.gutenberg-asso.fr:31022/dbitouze/pygments-acetexlexer.git
cd pygments-acetexlexer
../.venv/bin/pip install --no-cache-dir .
cd ..
rm -rf pygments-acetexlexer
make linkcheck

jayaddison commented 8 months ago

Hi @dbitouze - when you run the build, do you see URLs and status codes (ok, broken, redirected, ...) printed out?

During local testing, I was able to replicate a never-exiting linkcheck build - seemingly due to server-held connections remaining open. Configuring a linkcheck_timeout value resolved that.

(please note: links that time out during checking are currently reported as broken - personally, I think we should introduce an additional linkcheck status code to allow users to distinguish those from error/not-found links)

dbitouze commented 8 months ago

Hi @jayaddison,

when you run the build, do you see URLs and status codes (ok, broken, redirected, ...) printed out?

Yes.

During local testing, I was able to replicate a never-exiting linkcheck build - seemingly due to server-held connections remaining open. Configuring a linkcheck_timeout value resolved that.

Okay, but I'm not sure I understand what this timeout applies to: each checked link, or the entire link check? In the latter case, we'd miss the remaining unchecked links, wouldn't we? In any case, what timeout value would you recommend?

(please note: links that time out during checking are currently reported as broken - personally, I think we should introduce an additional linkcheck status code to allow users to distinguish those from error/not-found links)

Would be nice!

Thanks!

jayaddison commented 8 months ago

During local testing, I was able to replicate a never-exiting linkcheck build - seemingly due to server-held connections remaining open. Configuring a linkcheck_timeout value resolved that.

Okay, but I'm not sure I understand what this timeout applies to: each checked link, or the entire link check? In the latter case, we'd miss the remaining unchecked links, wouldn't we? In any case, what timeout value would you recommend?

Timeouts can be complicated, but roughly speaking, and since it isn't (yet!) well documented: linkcheck_timeout is a per-hyperlink timeout, in seconds, indicating how long the linkcheck builder will wait for a response from a webserver it is connected to.

I'd typically recommend a value somewhere in the range of 1 to 15 seconds. If you experience (or expect to experience) high latency to the webservers you're checking, you may want to increase it; alternatively, if you want link checking to finish more quickly and don't mind potentially false broken-link reports, you may want to decrease it.
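
For reference, this setting goes in the project's conf.py; a minimal sketch (the 15-second value is just an example within the range suggested above):

```python
# conf.py -- per-hyperlink timeout for Sphinx's linkcheck builder, in seconds.
# Illustrative value; tune it to the latency you expect from the servers checked.
linkcheck_timeout = 15
```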

dbitouze commented 8 months ago

I tried with linkcheck_timeout = 15 and the make linkcheck process finally ended (thanks!), but with the two last lines in the terminal being:

build finished with problems, 35 warnings.
make: *** [Makefile:20: linkcheck] Error 1

Apart from this,

jayaddison commented 8 months ago

Thanks @dbitouze - I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.

AA-Turner commented 8 months ago

OK, closing on the basis that this seems fine.

@jayaddison -- do you think we should add a default timeout of, say, 30s? We can track it in a new issue if you agree.

A

jayaddison commented 7 months ago

@AA-Turner yep, I'd been wondering whether to set a non-zero default timeout too. After some more thought: yes, I think it makes sense (I'll open that issue in a few moments).

It could result in a few additional cases of confusion until #11868 arrives, but that confusion is already possible in cases where link check timeouts have been configured - so I don't think that issue should be a blocker.

dbitouze commented 7 months ago

I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.

IMHO:

make: *** [Makefile:20: linkcheck] Error 1

could lead the user to think that a (possibly fatal) error happened during the process. Wouldn't it be possible to emit a more explicit and less worrying message?

jayaddison commented 7 months ago

I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.

IMHO:

make: *** [Makefile:20: linkcheck] Error 1

could lead the user to think that a (possibly fatal) error happened during the process. Wouldn't it be possible to emit a more explicit and less worrying message?

The output of make can be quite terse, yep -- currently, the linkchecker communicates success or failure by reporting a status code: zero, by convention, for success (all is well with the checked links), or non-zero to indicate to the parent script or user that a problem occurred.

(similar to the way the diff command exits with zero when no differences are found, and non-zero otherwise)

@dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.
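
The diff analogy above can be demonstrated directly; a small shell sketch (file names and contents are arbitrary examples):

```shell
# Two files with different contents.
printf 'alpha\n' > left.txt
printf 'beta\n'  > right.txt

# diff exits 0 when the files match, 1 when they differ (2 on trouble);
# capture the status without letting a non-zero exit abort the script.
status=0
diff -q left.txt right.txt >/dev/null || status=$?
echo "diff exit status: $status"

rm -f left.txt right.txt
```

make linkcheck follows the same convention: a non-zero exit means "findings were reported", not necessarily that the tool itself crashed.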

dbitouze commented 7 months ago

@dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.

Our use case is simply to detect broken links across our more than 1200 pages, some of which were written ages ago.

It's strange that a script designed to detect failures is said to be in error when it has properly done its job :wink:

jayaddison commented 7 months ago

@dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.

Our use case is simply to detect broken links across our more than 1200 pages, some of which were written ages ago.

It's strange that a script designed to detect failures is said to be in error when it has properly done its job 😉

I think it makes sense when the goal of a documentation project is to ensure that all links are valid at build time, and there's enough capacity on the team to handle broken links if and when they occur. In that kind of situation, you might indeed want the continuous integration (which runs make, for example) to show an error when a broken link is detected. I've always liked the phrase "cool URIs don't change", and I think that good websites follow that principle.
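
For a project with many old pages, Sphinx also provides linkcheck_ignore in conf.py: a list of regular expressions for URLs that should not be checked. A hedged sketch (the patterns below are hypothetical examples, not taken from this thread):

```python
# conf.py -- linkcheck builder options (values are illustrative).
linkcheck_timeout = 15  # seconds to wait per hyperlink

# Regular expressions matching URLs to skip entirely, e.g. known-dead
# historical links or hosts that block automated checkers.
linkcheck_ignore = [
    r"https://example\.org/legacy/.*",
    r"http://localhost:\d+/",
]
```

This lets a team keep the exit-code-as-failure convention in CI while carving out links it has consciously decided not to maintain.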

dbitouze commented 7 months ago

It's strange that a script designed to detect failures is said to be in error when it has properly done its job 😉

I think it makes sense when the goal of a documentation project is to ensure that all links are valid at build time, and there's enough capacity on the team to handle broken links if and when they occur. In that kind of situation, you might indeed want the continuous integration (which runs make, for example) to show an error when a broken link is detected. I've always liked the phrase "cool URIs don't change", and I think that good websites follow that principle.

That makes sense, indeed.