Closed dbitouze closed 8 months ago
Does it work with fewer files? Can we know which link is being tested? Since the CI/CD works, it's impossible for us to investigate further unless we know what to look for.
> Does it work with fewer files?
With fewer files (112), it almost works. Running with the `-vv` verbose option, I get:

```
[...]
[app] emitting event: 'build-finished'(None,)
sphinx-sitemap: No pages generated for sitemap.xml
build finished with problems, 371 warnings.
make: *** [Makefile:20: linkcheck] Error
```
> Can we know which link is being tested?
Sorry, I don't see what you mean.
> Since the CI/CD works, it's impossible for us to investigate further unless we know what to look for.
I can imagine; sorry for not being able to provide more information.
Without a repository to reproduce the error, I'm forced to close -- sorry.
> Without a repository to reproduce the error, I'm forced to close -- sorry.
Ah, sorry, I didn't know that I could just let you clone our repository for testing. Here are the steps to reproduce the error:
```shell
git clone https://gitlab.gutenberg-asso.fr/gutenberg/faq-gut.git
# or:
# git clone ssh://git@gitlab.gutenberg-asso.fr:31022/gutenberg/faq-gut.git
cd faq-gut
python3 -m venv .venv
source .venv/bin/activate
.venv/bin/pip install --no-cache-dir -U pip
.venv/bin/pip install --no-cache-dir \
    Sphinx \
    Pillow \
    sphinx_comments \
    sphinx_design \
    sphinxext.opengraph \
    myst_parser \
    linkify-it-py \
    sphinx_tippy \
    sphinx_sitemap \
    pydata_sphinx_theme \
    sphinx_copybutton \
    sphinx_togglebutton \
    sphinx_examples \
    sphinx_last_updated_by_git
git clone https://gitlab.gutenberg-asso.fr/dbitouze/pygments-acetexlexer.git
# or:
# git clone ssh://git@gitlab.gutenberg-asso.fr:31022/dbitouze/pygments-acetexlexer.git
cd pygments-acetexlexer
../.venv/bin/pip install --no-cache-dir .
cd ..
rm -rf pygments-acetexlexer
make linkcheck
```
Hi @dbitouze - when you run the build, do you see URLs and status codes (`ok`, `broken`, `redirected`, ...) printed out?
During local testing, I was able to replicate a never-exiting `linkcheck` build - seemingly due to server-held connections remaining open. Configuring a `linkcheck_timeout` value resolved that.

(please note: links that time out during checking are currently reported as `broken` - personally I think we should probably introduce an additional `linkcheck` status code to allow users to distinguish those from error/not-found links)
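For reference, here is a minimal `conf.py` sketch of that setting. The 15-second value and the commented-out knobs are illustrative assumptions, not recommendations; tune them for the servers you check:

```python
# conf.py (excerpt)

# Per-hyperlink timeout for the linkcheck builder, in seconds.
# Links that exceed it are currently reported as broken.
linkcheck_timeout = 15

# Related knobs (values shown are illustrative assumptions):
# linkcheck_retries = 2    # re-try a link before declaring it broken
# linkcheck_workers = 5    # number of parallel checker threads
```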
Hi @jayaddison,
> when you run the build, do you see URLs and status codes (`ok`, `broken`, `redirected`, ...) printed out?
Yes.
> During local testing, I was able to replicate a never-exiting `linkcheck` build - seemingly due to server-held connections remaining open. Configuring a `linkcheck_timeout` value resolved that.
Okay, but I'm not sure I understand what this timeout applies to: each verified link or the entire link check? In the latter case, we'd be missing other unverified links, wouldn't we? In any case, what would you recommend as a timeout?
> (please note: links that time out during checking are currently reported as `broken` - personally I think we should probably introduce an additional `linkcheck` status code to allow users to distinguish those from error/not-found links)
Would be nice!
Thanks!
> > During local testing, I was able to replicate a never-exiting `linkcheck` build - seemingly due to server-held connections remaining open. Configuring a `linkcheck_timeout` value resolved that.
>
> Okay, but I'm not sure I understand what this timeout applies to: each verified link or the entire link check? In the latter case, we'd be missing other unverified links, wouldn't we? In any case, what would you recommend as a timeout?
Timeouts can be complicated, but roughly speaking, and since it isn't (yet!) well documented: `linkcheck_timeout` is a per-hyperlink timeout, in seconds, that indicates how long the `linkcheck` builder will wait for a response from a webserver that it is connected to.

I'd recommend a value somewhere in the range of 1 to 15 seconds, typically. If you experience (or expect to experience) high latency to the webservers you're checking, you may want to increase the value; alternatively, if you want link checking to finish more quickly and don't care so much about potentially false broken-link reports, you may want to decrease it.
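To illustrate the per-hyperlink behaviour, here is a small standard-library sketch (not Sphinx's actual implementation): each request gets its own deadline, so one slow server fails fast for that link only, without stalling the rest of the run.

```python
# Sketch: a per-request timeout against a deliberately slow local server.
import http.server
import socket
import threading
import time
import urllib.error
import urllib.request

class SlowHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(2)  # hold the connection longer than the client's timeout
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # silence request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), SlowHandler)
server.handle_error = lambda request, client_address: None  # silence broken-pipe noise
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

try:
    urllib.request.urlopen(url, timeout=0.5)  # per-link timeout, in seconds
    result = "ok"
except (urllib.error.URLError, socket.timeout, TimeoutError):
    result = "timeout"  # only this link is affected; other links would proceed

print(result)
```

With a 0.5-second timeout against a server that holds the connection for 2 seconds, the check for this one URL gives up quickly instead of hanging indefinitely.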
I tried with `linkcheck_timeout = 15` and the `make linkcheck` process finally ended (thanks!), but with the last two lines in the terminal being:

```
build finished with problems, 35 warnings.
make: *** [Makefile:20: linkcheck] Error 1
```
Apart from this: output.txt

Thanks @dbitouze - I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.
Ok, closing on the basis of this seems fine.
@jayaddison -- do you think we should add a default timeout of say 30s? Can track in a new issue if you agree.
@AA-Turner yep, I'd been wondering about whether to set a non-zero default timeout too. After some more thought, yes, I think it does make sense too (I'll open that issue in a few moments).
It could result in a few additional cases of confusion until #11868 arrives, but that confusion is already possible in cases where link check timeouts have been configured - so I don't think that that issue should be a blocker for it.
> I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.

IMHO:

```
make: *** [Makefile:20: linkcheck] Error 1
```

could let the user think a (possibly fatal) error happened during the process. Wouldn't it be possible to emit a more explicit and less worrying message?
> > I think that the output is as expected; an exit code of 1 is returned if the linkchecker finds any broken links, and it emits warnings for broken links and unexpected redirects.
>
> IMHO: `make: *** [Makefile:20: linkcheck] Error 1` could let the user think a (possibly fatal) error happened during the process. Wouldn't it be possible to emit a more explicit and less worrying message?
The output of `make` can be quite terse, yep -- currently the way that the linkchecker communicates success/failure is by reporting a status code: zero, by convention, for success (all's well with the checked links), or non-zero to indicate to the parent script / user that a problem occurred.

(similar to the way that the `diff` command exits with zero when no differences are found, or non-zero otherwise)
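As a sketch of that convention, a wrapper script can translate the status code into a friendlier message. The `linkcheck` function below is a stand-in for `make linkcheck`, not the real builder:

```shell
# Simulate `make linkcheck` exiting with status 1 because broken links
# were found, then translate the status into a less alarming message.
linkcheck() { return 1; }   # stand-in for: make linkcheck

if linkcheck; then
  msg="all links OK"
else
  # Status 1 means "broken links were reported", not a crash of the build.
  msg="broken links found (exit status $?)"
fi
echo "$msg"
```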
@dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.
> @dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.
Our use case is just to detect broken links in our more than 1200 pages, some of them written ages ago.
It's strange that a script designed for detecting failures is said to be in error when it properly did its job :wink:
> > @dbitouze could you explain a bit more about the use-case? It's giving me a few ideas, although perhaps some more information is required to figure it out.
>
> Our use case is just to detect broken links in our more than 1200 pages, some of them written ages ago.
>
> It's strange that a script designed for detecting failures is said to be in error when it properly did its job 😉
I think it makes sense when the goal of a documentation project is to ensure that all links are valid at build time, and there's also enough capacity on the team to handle broken links if and when they occur. In that kind of situation, you might indeed want the continuous integration (that runs `make`, for example) to show an error when a broken link is detected. I've always liked the phrase "cool URIs don't change" and I think that good websites follow that principle.
> > It's strange that a script designed for detecting failures is said to be in error when it properly did its job 😉
>
> I think it makes sense when the goal of a documentation project is to ensure that all links are valid at build time, and there's also enough capacity on the team to handle broken links if and when they occur. In that kind of situation, you might indeed want the continuous integration (that runs `make`, for example) to show an error when a broken link is detected. I've always liked the phrase "cool URIs don't change" and I think that good websites follow that principle.
That makes sense, indeed.
Describe the bug

The execution of `make linkcheck` never ends: the only way to stop it is Ctrl+C.

How to Reproduce

Not sure how to reproduce: we have about 1200 `.md` source files.

Environment Information

Sphinx extensions

Additional context

No response