sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/
Other
6.61k stars 2.13k forks source link

Remove ``has_equations`` from the mathematics domain #13044

Open AA-Turner opened 1 month ago

AA-Turner commented 1 month ago

Feature or Bugfix

Purpose

When profiling, MathDomain.process_doc() takes around 1% of runtime. We can reduce this to ~0.

chrisjsewell commented 1 month ago

Heya, you do realise this is used by one of sphinx's own extensions 😅: https://github.com/sphinx-doc/sphinxcontrib-jsmath/blob/19763d7fc9ebb29eb2f325fef0bc6f067907a233/sphinxcontrib/jsmath/__init__.py#L70 it is also used here: https://github.com/jbms/sphinx-immaterial/blob/69e685aedd338f6766ddb5bae9b2189d2e7a6898/sphinx_immaterial/local_mathjax.py#L28

Calling a method internal, just because it is undocumented, is incorrect; if does not start with an _ it is public.

Not saying not to change, but the new means of detection should also be public and documented

AA-Turner commented 1 month ago

Calling a method internal, just because it is undocumented, is incorrect; if does not start with an _ it is public.

Whilst we are now using this philosophy for new code, I don't think it holds true for old code, where marking names as private was less deliberate.

Both cases you cite are effective copies of ext.mathjax, but I will see if there is a means to improve compatability (at the very least, restore MathDomain.has_equations() and always return True).

A

picnixz commented 4 weeks ago

I think we should restore has_equations(). On the other hand, I'm wondering whether has_equations could be cached or if there was a reason why I didn't consider it at that time. For instance:

return (
    self.data['has_equations'].get(docname, False)
    or any(map(self.has_equations, self.env.toctree_includes.get(docname, ())))
)

works well the first time, but I'm wondering if we shouldn't just do

has_equations = self.data['has_equations'].get(docname, False)
if has_equations:
    return True

has_equations = any(map(self.has_equations, self.env.toctree_includes.get(docname, ())))
if has_equations:
    self.data['has_equations'][docname] = True
return has_equations

This could at least save re-testing has_equations. We can also save some memory by only keeping the docnames that have equations in a set instead of a dict.

AA-Turner commented 4 weeks ago

The slow bit isn't has_equations, but process_doc - iterating through every node of every document takes a long time.

chrisjsewell commented 4 weeks ago

I don't think it holds true for old code, where marking names as private was less deliberate.

I really think it does; users will not care about your standards, if something is not _ they have every right to use it and probably have (a bug s a feature lol) ... but I don't want to block on this; I don't think has_equations is that important a hill to die on 😄