pydata / pydata-sphinx-theme

A clean, three-column Sphinx theme with Bootstrap for the PyData community
https://pydata-sphinx-theme.readthedocs.io
BSD 3-Clause "New" or "Revised" License
585 stars 306 forks source link

Figure out why pydata-sphinx-theme is parallel-write-unsafe, and fix it #1643

Closed drammock closed 2 months ago

drammock commented 7 months ago

In working on https://github.com/scipy/scipy/pull/16660 I discovered that our theme is parallel-write unsafe (see esp. this comment, and the PR description of #1642 where I mark the theme as parallel-write unsafe).

This is a reminder to try to figure out why, and hopefully fix it so that we can remove the parallel-write-unsafe flag.

RobPasMue commented 7 months ago

This is a rather unpleasant issue on our side because we are forcing warnings to raise as errors (in order to properly maintain our docs). Having to switch to single-threaded mode is also inconvenient. It'd be great if we could get this fix soon in the next release 😄! Upvoting the issue!!

stinodego commented 6 months ago

I ran into this when trying to upgrade from version 0.14.1 to version 0.15.2. We use Sphinx version 7.2.4.

Eagerly awaiting a fix so that we may upgrade to the latest version of the theme!

drammock commented 6 months ago

@RobPasMue and @stinodego can you tell us how big of an impact this is having for you? I.e., for the project(s) you're working on, how long is the doc build with -j1 and how long with -jN (for whatever N is typical in your workflow)? I'm trying to gauge how to prioritize this; is it "single-threaded builds is a crippling non-starter" or more like "single-threaded builds is inconvenient"?

For context, the changes that precipitated the parallel-write-unsafe marking were discovered when building the SciPy docs (which are fairly large) and even there the single-threaded build is only around ~12 minutes (which in the context of local dev/docs work is bad, but for CI servers on pull requests not so bad). Without the change, SciPy was seeing build times of ~30 minutes (even with -j2). It would be helpful to have similar timing numbers for other projects.

stinodego commented 6 months ago

I just ran a quick local benchmark with -j auto vs no parallelization. With parallelization it's 2:37 to build the docs, without it's 11:14.

So that's quite significant. In practice, that goes from "let me quickly build the docs to see the effect of this change" to "let's just hope it renders correctly".

At this point, we will likely wait for a new version that supports parallization, rather than upgrading to a version without parallelization that may have some nice new features. I think a lot of other projects will have parallel building enabled as well (free performance) and will run into this limitation. So I would say it's quite a critical fix.

For reference, I work on Polars, and we've been happy users of the theme for a long time now! https://github.com/pola-rs/polars

RobPasMue commented 6 months ago

Hi @drammock - In my case I work with the PyAnsys ecosystem, where we are consumers of the pydata sphinx theme for over 3 years. I gave it a try with out library PyAnsys Geometry https://github.com/ansys/pyansys-geometry (without building the examples with sphinx gallery) and the results were as follows.

Using:

Sphinx==7.2.6
pydata-sphinx-theme==0.14.4

With parallelization (-j auto): 1:01 No parallelization (-j1): 2:00

Our build times are not that big in any case, but the impact of parallelization is significant. However, the main issue we are facing is that we enable parallelization by default throughout the ecosystem. And we are forcing warnings to behave as errors so that we ensure a proper build process. We have a wrapped version of the pydata sphinx theme (i.e. https://github.com/ansys/ansys-sphinx-theme) which we have tailored for our brand. We had to limit the version to the latest one with parallelization but as @stinodego mentioned, we would love to still use the latest and greatest features you guys are working on.

cosenal commented 5 months ago

Adding another datapoint about the impact of this issue for the https://github.com/unitaryfund/mitiq project.

We compile Jupyter notebooks with Myst-NB when generating docs, and it turns out, such task is quite parallelizable. The speed up for a sphinx-build from a clean env with parallel mode (-j auto) is ~2.5x on my machine, see description of https://github.com/unitaryfund/mitiq/pull/2235.

Just like the other comments in this thread, we also have strict failure on warnings, so pydata-sphinx-theme not being parallel-write-safe prevents us from moving away from a serial build.

drammock commented 5 months ago

thanks all for the input, and apologies for the radio silence. Clearly this is a high priority for many theme users. As is probably obvious, my availability for work on the theme has dropped off in recent few months, so I have a suggestion to move this forward: we're only aware of parallel-build problems from one theme user (SciPy). Since SciPy already knows about this and has set their CI builds to disable parallelism, we could in theory remove the parallel_write_unsafe flag from the theme so that the rest of you can enjoy parallelism again --- with the understanding that you should watch out for odd behavior (what SciPy saw was the left sidebar contents sometimes not matching the top-level section of the docs that the page was in).

Shall we do a 👍🏻 / 👎🏻 vote on re-enabling parallel writes? In the meantime, I'll point out the help wanted label on this issue, in case anyone wants to try their hand at figuring out why SciPy's builds get messed up; I'll gladly take a bit of time to get someone up to speed if they're willing to work on it.

cc @tupui

dbitouze commented 5 months ago

(what SciPy saw was the left sidebar contents sometimes not matching the top-level section of the docs that the page was in)

What would be nice is if other strange behaviors noticed by people were reported somewhere (here?) so that others could check whether these behaviors also affect their sites.

cosenal commented 5 months ago

I can try to install the theme package from git and to set parallel_write_unsafe flag to False in my environment, before you officially disable it in the release. I will report back on the behaviors.

Update: I built an editable of the pydata-sphinx-theme package in my env and there is no difference in the look of our docs when I set the parallel_write_unsafe flag to either True or False.

@drammock You have my 👍 for flipping the flag in the official release of the theme.