Qiskit / qiskit_sphinx_theme

A Sphinx theme and documentation guidelines for Qiskit and Qiskit Ecosystem projects
https://qisk.it/docs-guide
Apache License 2.0
16 stars 29 forks source link

Furo: investigate slow builds #328

Closed coruscating closed 1 year ago

coruscating commented 1 year ago

I timed the Qiskit Experiments sphinx build after an identical previous build has already run and without running any jupyter-sphinx code, so the only tasks should be checking that there are no updates to the code and writing output pages. It's ~70 seconds with the current theme, but ~1700 seconds with the Furo build. The main bottleneck is in the writing output and highlighting module code phases.

coruscating commented 1 year ago

Some more rough Furo build times:

Autosummary is definitely a big problem, and the HTML pages aren't even being built since disabling all other extensions broke them, so the time taken to populate stubs/ is the bottleneck. Setting autosummary_generate_overwrite = False didn't affect the time either. I'm surprised changing the theme would have such a big effect on writing the stubs themselves.

Eric-Arellano commented 1 year ago

I'm pretty sure this is because of Furo's warning in https://github.com/pradyunsg/furo/discussions/227#discussioncomment-1249750

The entire site structure is presented in the sidebar -- if you have a big documentation set, all your pages get correspondingly large as well.

Unlike Pytorch, Furo puts every single subpage in the left ToC. This hypothesis is consistent with Helena's research above -- autosummary results in substantially more subpages being created.

I think we have two options:

  1. Stop having a dedicated page per method and function. Instead, host them on the class and module page. Regardless of Furo, this would speed up docs builds a lot, according to past benchmarking by Jake Lishman.
  2. Override Furo to respect maxdepth option so we don't render the whole ToC. This would look like re-overriding the HTML context's furo_table_of_contents to have a different maxdepth, after Furo has already set it:

https://github.com/pradyunsg/furo/blob/80afa27a9a7bd0ac9259636239e349656c4726ab/src/furo/__init__.py#L115

coruscating commented 1 year ago

I'm definitely in favor of trimming the ToC to reduce page size. As for the dedicated pages, there definitely are a lot of function and attribute pages that are basically stubs and not worth being rendered on their own page. But I'm wondering if this might lead to complex and well-documented classes/modules generating very long pages.

Eric-Arellano commented 1 year ago

@coruscating found some great data: https://github.com/Qiskit/qiskit_sphinx_theme/pull/451#issuecomment-1618893962. The Pytorch theme is now building 5x slower on 1.13 than 1.12! I'm going to git bisect to find the culprit. navigation_depth is relevant to Furo's slow performance per her benchmarks, but it's not the sole explanation.

pradyunsg commented 1 year ago

Do you have an example documentation set for me to look at, that produces these slow behaviours?

Eric-Arellano commented 1 year ago

Thanks @pradyunsg for offering to take a look! Check out https://github.com/Qiskit/qiskit-metapackage/suites/13985744654/artifacts/780412210. You'll notice the file size bloats to about 700-900kb and the build takes 3.5 hours vs. 30 minutes for Pytorch, with around 85-90% of the HTML being the left side bar.

Again, we know we need to fix this by not having dedicated HTML pages per function & tutorial. But even still, we will have a huge amount of docs because the Qiskit API is so large.

pradyunsg commented 1 year ago

Yea, the most of that toctree is basically your autosummary documentation. I think the first thing I'd try to do is use https://pypi.org/project/sphinx-remove-toctrees/ with:

remove_from_toctrees = ["stubs/*"]

That should handle the toctree problem that you have.

FWIW, it would be useful to have the sources and instructions to building documentation with those sources -- I'm interested in profiling where in Furo's own codebase is the hot-loop/ slowness.

Eric-Arellano commented 1 year ago

Yea, the most of that toctree is basically your autosummary documentation.

100%. I believe the time complexity is O(n^2): for every n page, we rerun get_navigation_tree which processes n <li> elements, one <li> per page. I see the code has functools.lru_cache, but I'm not sure the toctree_html is the same input every time?

One angle I haven't explored enough yet is if it get_navigation_tree can be optimized.

I think the first thing I'd try to do is use https://pypi.org/project/sphinx-remove-toctrees/ with:

Good idea to benchmark removing all stubs. We only removed methods, which took us from 3.5 hours to 1.5 hours. See https://github.com/Qiskit/qiskit-metapackage/pull/1770.

Although, sphinx-remove-toctree does not highlight the parent page when you're on a subpage, unlike what happens with navigation_depth. You can see this in the docs build from my last comment by navigating to a method page (API Reference -> Quantum Circuits -> QuantumCircuit -> scroll down to methods table). So, the user has no indication in the left sidebar where they are. sphinx-remove-toctrees is useful to debug, but not acceptable for us to use in production.

FWIW, it would be useful to have the sources and instructions to building documentation with those sources -- I'm interested in profiling where in Furo's own codebase is the hot-loop/ slowness.

Ah, pardon. It's this repo: https://github.com/Qiskit/qiskit-metapackage and this PR: https://github.com/Qiskit/qiskit-metapackage/pull/1767. We use some custom directives so it won't work to directly use furo instead of qiskit-sphinx-theme.

Then, tox -e docs. But warning it's incredibly slow, even on Pytorch, hence why we want to reorganize the API docs regardless of Furo.

It is likely much faster and less frustrating to artificially generate a test site. Afaict, the important thing is # of HTML pages, not the content on those pages. Qiskit has this 4527 pages:

Also, we do a weird thing of Git cloning two other repos for the docs build, qiskit-tutorials and qiskit-terra (we're actively migrating away from this). A Sphinx extension docs/custom_extensions.py handles that. You can speed up the build by disabling tutorials with this diff:

diff --git a/docs/conf.py b/docs/conf.py
index a16a69b..56959f7 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -251,7 +251,7 @@ def trim_toctree(_: sphinx.application.Sphinx, env) -> None:

 def setup(app):
     custom_extensions.load_terra_docs(app)
-    custom_extensions.load_tutorials(app)
+    # custom_extensions.load_tutorials(app)
     versionutils.setup(app)
     app.connect("build-finished", custom_extensions.clean_docs)
     app.connect("env-updated", trim_toctree)
diff --git a/docs/index.rst b/docs/index.rst
index 7a4392f..9c13dbc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -64,7 +64,6 @@ Main Qiskit-related projects
    qc_intro
    getting_started
    intro_tutorial1
-   tutorials
    API Reference <apidoc/terra>
    Migration Guides <migration_guides/index>
    release_notes
diff --git a/docs/tutorials.rst b/docs/tutorials.rst
deleted file mode 100644
index 40d6361..0000000
--- a/docs/tutorials.rst
+++ /dev/null
@@ -1,69 +0,0 @@
-:orphan:
-
-.. _tutorials:
-
-=========
-Tutorials
-=========
-
-Introductory
-============
-
-.. qiskit-card::
-   :header: Qiskit warmup
-   :card_description: An introduction to Qiskit and the primary user workflow.
-   :image: _static/images/logo.png
-   :link: intro_tutorial1.html
-
-Quantum circuits
-================
-
-.. nbgallery::
-   :glob:
-
-   tutorials/circuits/*
-
-Advanced circuits
-=================
-
-.. nbgallery::
-   :glob:
-
-   tutorials/circuits_advanced/*
-
-Classical simulators
-====================
-
-.. nbgallery::
-   :glob:
-
-   tutorials/simulators/*
-
-Algorithms
-==========
-
-.. nbgallery::
-   :glob:
-
-   tutorials/algorithms/*
-
-Operators
-=========
-
-.. nbgallery::
-   :glob:
-
-   tutorials/operators/*
-
-Sample algorithms in Qiskit
-===========================
-
-.. nbgallery::
-   :glob:
-
-   tutorials/textbook/*
-
-.. Hiding - Indices and tables
-   :ref:`genindex`
-   :ref:`modindex`
-   :ref:`search`
pradyunsg commented 1 year ago

Although, sphinx-remove-toctree does not highlight the parent page when you're on a subpage, unlike what happens with navigation_depth.

Ugh, yea, that makes sense. :/

Eric-Arellano commented 1 year ago

@pradyunsg good news! Your suggestion of using sphinx-remove-toctree to remove all API docs results in a 21-minute CI build, rather than 3.5 hours. Compared to 23 minutes for Pytorch on our latest PR. Meaning, I believe all the slowdown is coming from the left sidebar; removing that variable, Furo is in the same ballpark / slightly faster than Pytorch 🎉

So, maybe you'd be open to something like this?

That is, #674 is solely intended as a tool for dev builds, not prod builds.

Additionally, maybe it's possible to optimize get_navigation_tree in Furo, but that's not guaranteed to work.

Thoughts?

we add navigation_depth to Furo so they're as fast as possible (https://github.com/pradyunsg/furo/pull/674)

If you're not comfortable merging #674, then I think we'll use sphinx-remove-toctree for dev builds to get a similar speedup. But #674 would be great because it's:

  1. More ergonomic to set up,
  2. More powerful. In dev builds, we'd want navigation_depth=2 or 3 since it's "fast enough" and makes the dev build closer to prod. We can't get that precision with sphinx-remove-toctree because it doesn't have the concept of nesting levels.

We'd love to make it easier to help other Furo users facing similar build speed issues.

coruscating commented 1 year ago

@pradyunsg Is it possible to introduce the option of rendering the sidebar as a separate asset? It would solve a lot of problems for us if the sidebar can be cached user-side. Here's some toy code for conf.py that writes _static/sidebar.html once with the toctree:

def setup(app):
    ...
    app.connect("html-page-context", write_sidebar)

def write_sidebar(app, pagename, templatename, context, doctree):
    # only run once
    if pagename != "index":
        return

    sidebar_html = context["toctree"](
        maxdepth=-1, collapse=True, titles_only=True, includehidden=True
    )

    output_path = join(app.builder.outdir, "_static", "sidebar.html")
    with open(output_path, "w") as f:
        # using Furo's specific tree modifications
        f.write(get_navigation_tree(sidebar_html))

And then navigation.html can be overridden to load content from _static/sidebar.html instead of context["furo_navigation_tree"]. Of course more Javascript would be needed on top of this, for example to highlight the current page in the toctree.

Eric-Arellano commented 1 year ago

To close the loop, we were able to get acceptable speeds by

1) reorganizing our API docs to have fewer pages, and 2) distinguishing between dev vs prod builds. See https://github.com/Qiskit/qiskit/pull/10624.

Interestingly, viewcode results in a 14 minute slowdown for Qiskit when using Furo! I think it was only about a 30 second slowdown when using the old Pytorch theme.

pradyunsg commented 1 year ago

@pradyunsg Is it possible to introduce the option of rendering the sidebar as a separate asset?

I'd be open to that, as an opt-in option. I'm wary of the complexity of this solution somewhat -- but we can investigate that separately perhaps.