Closed coruscating closed 1 year ago
Some more rough Furo build times:
Autosummary is definitely a big problem, and the HTML pages aren't even being built since disabling all other extensions broke them, so the time taken to populate stubs/
is the bottleneck. Setting autosummary_generate_overwrite = False
didn't affect the time either. I'm surprised changing the theme would have such a big effect on writing the stubs themselves.
I'm pretty sure this is because of Furo's warning in https://github.com/pradyunsg/furo/discussions/227#discussioncomment-1249750
The entire site structure is presented in the sidebar -- if you have a big documentation set, all your pages get correspondingly large as well.
Unlike Pytorch, Furo puts every single subpage in the left ToC. This hypothesis is consistent with Helena's research above -- autosummary results in substantially more subpages being created.
I think we have two options:
maxdepth
option so we don't render the whole ToC. This would look like re-overriding the HTML context's furo_table_of_contents
to have a different maxdepth, after Furo has already set it:I'm definitely in favor of trimming the ToC to reduce page size. As for the dedicated pages, there definitely are a lot of function and attribute pages that are basically stubs and not worth being rendered on their own page. But I'm wondering if this might lead to complex and well-documented classes/modules generating very long pages.
@coruscating found some great data: https://github.com/Qiskit/qiskit_sphinx_theme/pull/451#issuecomment-1618893962. The Pytorch theme is now building 5x slower on 1.13 than 1.12! I'm going to git bisect
to find the culprit. navigation_depth
is relevant to Furo's slow performance per her benchmarks, but it's not the sole explanation.
Do you have an example documentation set for me to look at, that produces these slow behaviours?
Thanks @pradyunsg for offering to take a look! Check out https://github.com/Qiskit/qiskit-metapackage/suites/13985744654/artifacts/780412210. You'll notice the file size bloats to about 700-900kb and the build takes 3.5 hours vs. 30 minutes for Pytorch, with around 85-90% of the HTML being the left side bar.
Again, we know we need to fix this by not having dedicated HTML pages per function & tutorial. But even still, we will have a huge amount of docs because the Qiskit API is so large.
Yea, the most of that toctree is basically your autosummary documentation. I think the first thing I'd try to do is use https://pypi.org/project/sphinx-remove-toctrees/ with:
remove_from_toctrees = ["stubs/*"]
That should handle the toctree problem that you have.
FWIW, it would be useful to have the sources and instructions to building documentation with those sources -- I'm interested in profiling where in Furo's own codebase is the hot-loop/ slowness.
Yea, the most of that toctree is basically your autosummary documentation.
100%. I believe the time complexity is O(n^2): for every n page, we rerun get_navigation_tree
which processes n <li>
elements, one <li>
per page. I see the code has functools.lru_cache
, but I'm not sure the toctree_html
is the same input every time?
One angle I haven't explored enough yet is if it get_navigation_tree
can be optimized.
I think the first thing I'd try to do is use https://pypi.org/project/sphinx-remove-toctrees/ with:
Good idea to benchmark removing all stubs. We only removed methods, which took us from 3.5 hours to 1.5 hours. See https://github.com/Qiskit/qiskit-metapackage/pull/1770.
Although, sphinx-remove-toctree
does not highlight the parent page when you're on a subpage, unlike what happens with navigation_depth
. You can see this in the docs build from my last comment by navigating to a method page (API Reference -> Quantum Circuits -> QuantumCircuit -> scroll down to methods table). So, the user has no indication in the left sidebar where they are. sphinx-remove-toctrees
is useful to debug, but not acceptable for us to use in production.
FWIW, it would be useful to have the sources and instructions to building documentation with those sources -- I'm interested in profiling where in Furo's own codebase is the hot-loop/ slowness.
Ah, pardon. It's this repo: https://github.com/Qiskit/qiskit-metapackage and this PR: https://github.com/Qiskit/qiskit-metapackage/pull/1767. We use some custom directives so it won't work to directly use furo
instead of qiskit-sphinx-theme
.
Then, tox -e docs
. But warning it's incredibly slow, even on Pytorch, hence why we want to reorganize the API docs regardless of Furo.
It is likely much faster and less frustrating to artificially generate a test site. Afaict, the important thing is # of HTML pages, not the content on those pages. Qiskit has this 4527 pages:
Also, we do a weird thing of Git cloning two other repos for the docs build, qiskit-tutorials
and qiskit-terra
(we're actively migrating away from this). A Sphinx extension docs/custom_extensions.py
handles that. You can speed up the build by disabling tutorials with this diff:
diff --git a/docs/conf.py b/docs/conf.py
index a16a69b..56959f7 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -251,7 +251,7 @@ def trim_toctree(_: sphinx.application.Sphinx, env) -> None:
def setup(app):
custom_extensions.load_terra_docs(app)
- custom_extensions.load_tutorials(app)
+ # custom_extensions.load_tutorials(app)
versionutils.setup(app)
app.connect("build-finished", custom_extensions.clean_docs)
app.connect("env-updated", trim_toctree)
diff --git a/docs/index.rst b/docs/index.rst
index 7a4392f..9c13dbc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -64,7 +64,6 @@ Main Qiskit-related projects
qc_intro
getting_started
intro_tutorial1
- tutorials
API Reference <apidoc/terra>
Migration Guides <migration_guides/index>
release_notes
diff --git a/docs/tutorials.rst b/docs/tutorials.rst
deleted file mode 100644
index 40d6361..0000000
--- a/docs/tutorials.rst
+++ /dev/null
@@ -1,69 +0,0 @@
-:orphan:
-
-.. _tutorials:
-
-=========
-Tutorials
-=========
-
-Introductory
-============
-
-.. qiskit-card::
- :header: Qiskit warmup
- :card_description: An introduction to Qiskit and the primary user workflow.
- :image: _static/images/logo.png
- :link: intro_tutorial1.html
-
-Quantum circuits
-================
-
-.. nbgallery::
- :glob:
-
- tutorials/circuits/*
-
-Advanced circuits
-=================
-
-.. nbgallery::
- :glob:
-
- tutorials/circuits_advanced/*
-
-Classical simulators
-====================
-
-.. nbgallery::
- :glob:
-
- tutorials/simulators/*
-
-Algorithms
-==========
-
-.. nbgallery::
- :glob:
-
- tutorials/algorithms/*
-
-Operators
-=========
-
-.. nbgallery::
- :glob:
-
- tutorials/operators/*
-
-Sample algorithms in Qiskit
-===========================
-
-.. nbgallery::
- :glob:
-
- tutorials/textbook/*
-
-.. Hiding - Indices and tables
- :ref:`genindex`
- :ref:`modindex`
- :ref:`search`
Although, sphinx-remove-toctree does not highlight the parent page when you're on a subpage, unlike what happens with navigation_depth.
Ugh, yea, that makes sense. :/
@pradyunsg good news! Your suggestion of using sphinx-remove-toctree
to remove all API docs results in a 21-minute CI build, rather than 3.5 hours. Compared to 23 minutes for Pytorch on our latest PR. Meaning, I believe all the slowdown is coming from the left sidebar; removing that variable, Furo is in the same ballpark / slightly faster than Pytorch 🎉
So, maybe you'd be open to something like this?
navigation_depth
to Furo so they're as fast as possible (https://github.com/pradyunsg/furo/pull/674). We'd use the environment variables approach I mention there to only impact dev builds. We can redo that PR to e.g. call the option dev_only_navigation_depth
?That is, #674 is solely intended as a tool for dev builds, not prod builds.
Additionally, maybe it's possible to optimize get_navigation_tree
in Furo, but that's not guaranteed to work.
Thoughts?
we add
navigation_depth
to Furo so they're as fast as possible (https://github.com/pradyunsg/furo/pull/674)
If you're not comfortable merging #674, then I think we'll use sphinx-remove-toctree
for dev builds to get a similar speedup. But #674 would be great because it's:
navigation_depth=2
or 3 since it's "fast enough" and makes the dev build closer to prod. We can't get that precision with sphinx-remove-toctree
because it doesn't have the concept of nesting levels.We'd love to make it easier to help other Furo users facing similar build speed issues.
@pradyunsg Is it possible to introduce the option of rendering the sidebar as a separate asset? It would solve a lot of problems for us if the sidebar can be cached user-side. Here's some toy code for conf.py
that writes _static/sidebar.html
once with the toctree:
def setup(app):
...
app.connect("html-page-context", write_sidebar)
def write_sidebar(app, pagename, templatename, context, doctree):
# only run once
if pagename != "index":
return
sidebar_html = context["toctree"](
maxdepth=-1, collapse=True, titles_only=True, includehidden=True
)
output_path = join(app.builder.outdir, "_static", "sidebar.html")
with open(output_path, "w") as f:
# using Furo's specific tree modifications
f.write(get_navigation_tree(sidebar_html))
And then navigation.html
can be overridden to load content from _static/sidebar.html
instead of context["furo_navigation_tree"]
. Of course more Javascript would be needed on top of this, for example to highlight the current page in the toctree.
To close the loop, we were able to get acceptable speeds by
1) reorganizing our API docs to have fewer pages, and 2) distinguishing between dev vs prod builds. See https://github.com/Qiskit/qiskit/pull/10624.
Interestingly, viewcode results in a 14 minute slowdown for Qiskit when using Furo! I think it was only about a 30 second slowdown when using the old Pytorch theme.
@pradyunsg Is it possible to introduce the option of rendering the sidebar as a separate asset?
I'd be open to that, as an opt-in option. I'm wary of the complexity of this solution somewhat -- but we can investigate that separately perhaps.
I timed the Qiskit Experiments sphinx build after an identical previous build has already run and without running any jupyter-sphinx code, so the only tasks should be checking that there are no updates to the code and writing output pages. It's ~70 seconds with the current theme, but ~1700 seconds with the Furo build. The main bottleneck is in the writing output and highlighting module code phases.