sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/
Other
6.19k stars 2.03k forks source link

BUG: env.toctree_includes different between parallel and serial builds #12409

Open larsoner opened 1 month ago

larsoner commented 1 month ago

Describe the bug

In SciPy we had a bug where using pydata-sphinx-theme in serial builds was fine, but in parallel builds a jinja context did some incorrect HTML replacement. The short version is that the PST function looks for the ancestors of a given page using _get_toctree_ancestors(env.toctree_includes, pagename) and decides based on the ancestors what HTML to show.

In the correct-behavior case, which always happens in serial mode and sometimes happens in parallel mode, for the reference/cluster.hierarchy page we get what you'd expect:

ancestors=['reference/cluster.hierarchy', 'reference/cluster', 'reference/index']

In the incorrect case, which happens non-deterministically in parallel builds only, we get:

ancestors=['reference/cluster.hierarchy', 'reference/cluster', 'reference/index', 'dev/contributor/contributor_toc', 'dev/index']

You can see the addition of 'dev/contributor/contributor_toc' -- this isn't entirely unreasonable because reference/index is added to a hidden toctree of that page. But it does deviate from the serial behavior to have those in there.

I wonder if it's perhaps due to when the forked reading subprocesses return, and which env gets merged into which (and what takes precedence). There is also some setdefault code that could be involved. I'm not sure 🤷

How to Reproduce

After following SciPy's dev docs:

python dev.py doc -j6

or if you don't want it to build but have SciPy and pydata-sphinx-theme installed, clone SciPy main (before https://github.com/scipy/scipy/pull/20897 gets merged, like on https://github.com/scipy/scipy/commit/9d543aec2fefdf14daaae459ca00e3d03af61ac5) and then enter the doc dir and do:

python -msphinx -WT --keep-going -b html -d build/doctrees -E source build/html -j6

it will complain about bad refs and stuff but then if you inspect the output you might see the incorrect behavior (left) or the correct behavior (right) 🎲 :

Incorrect Correct
Screenshot 2024-06-04 at 12 31 30 PM Screenshot 2024-06-04 at 12 16 30 PM

Environment Information

Tried on latest `master` and Sphinx 5.3. Replicated on macOS but originally seen on Linux in CIs. Tried with Python 3.11 and 3.12, etc.

Sphinx extensions

I commented out a few of them to speed up the build but ultimately used:

    'sphinx.ext.autodoc',
    'sphinx.ext.autosummary',
    'sphinx.ext.coverage',
    'sphinx.ext.mathjax',
    'sphinx.ext.intersphinx',
    'numpydoc',
    'sphinx_design',
    'doi_role',
    'matplotlib.sphinxext.plot_directive',

Additional context

No response

picnixz commented 1 month ago

Tried on latest master and Sphinx 5.3.

So the bug has been since 5.3? I'm really surprised that no one noticed... Just wondering but.. what happens if you try using Python 3.9 and Python 3.10? If the same happens, I think we might have an issue with the merge environment process. By the way, does it also happen using another theme?

I'd like a MWE (with at least 10 documents because otherwise you cannot really trigger the parallel build) which does not use any custom extensions so that I can know which are the side-effects.

drammock commented 1 month ago

So the bug has been since 5.3?

5.3 and latest master are the only sphinx versions we tested (5.3 is lowest supported by pydata sphinx theme). Problem may occur with even earlier sphinx versions, we don't know.

what happens if you try using Python 3.9 and Python 3.10

I've repro'd it locally with 3.9 before, never expressly tried 3.10 I don't think

does it also happen using another theme?

Not expected to occur with other themes, because as mentioned:

the PST function looks for the ancestors of a given page using _get_toctree_ancestors(env.toctree_includes, pagename) and decides based on the ancestors what HTML to show.

... and most themes don't do that (at least none that I'm aware of). This is also probably why nobody noticed the problem before. Indeed so far it is only SciPy that has reported this behavior, most likely because their docs site is quite large; many other smaller sites using the PyData theme build in parallel and haven't encountered this.