python / cpython

The Python programming language
https://www.python.org
Other
63.51k stars 30.42k forks source link

Fix all Sphinx reference warnings in the documentation #101100

Open JulienPalard opened 1 year ago

JulienPalard commented 1 year ago

Documentation

Running sphinx-build in nit-picky mode, like:

sphinx-build -n . build/html/

or:

make html SPHINXERRORHANDLING=-n

gives tons warnings. ~8k of them at the time of writing.

Most of them are innocent, like using :const: while referring to a constant defined in another file, or using :meth:`__init__` to speak about the concept of an init method.

The fall in several cases:

I don't have statistics (yet) on the distribution for the 4 previous cases.

For the innocent usages of rst markers, there's two fixes:

My question is: should we try to fix all warnings so we can easily spot typos at build time?

I tried to see if some typo would have been avoided by the nit-picky mode by reading a few pages of WARNINGS and found just one: read1 instead of read in bz2.rst.

Linked PRs

encukou commented 1 year ago

See also: https://discuss.python.org/t/broken-references-in-sphinx-docs/19463

IMO, we should fix these. Many are actual issues in the documentation.

Here's a proof-of-concept GH Action that complains about nitpicks in changed files only: https://github.com/encukou/cpython/pull/21/commits/e97d91f71e336c474b047335f0084b2eb95a121e Feel free to take it, I don't think I'll have time for this soon.

CAM-Gerlach commented 1 year ago

IMO, we should fix these. Many are actual issues in the documentation.

Yes, definitely; I've been gradually helping on the docs I've/we've touched anyway, though it is a long term project. I've found a number of things that were undocumented, as well as a number of other real issues through that, e.g. on the sqlite3 module that Erlend and I fixed.

Most of them are innocent, like using :const: while referring to a constant defined in another file, or using :meth:__init__ to speak about the concept of an init method.

I'd argue neither is exactly innocent:

For the innocent usages of rst markers, there's two fixes:

There's a third, much simpler and better fix. If the link isn't desired, every Sphinx cross-reference role supports prepending ! to prevent Sphinx from trying to resolve the reference, which avoids both the warning and the (very slight) build-time lookup cost, while preserving both all the information in the source, and the formatting in the output (since the formatting of named roles and regular literals is not the same, at least in our current theme), while being easier than both (just add one character). So, you could just do:

:const:`!IGNORECASE`

sobolevn commented 1 year ago

I've sent an example PR with the fix for enum module: https://github.com/python/cpython/pull/101122

The dev experience was plesant, because on the second run sphinx only rebuilds (and warns about) changed files:

Снимок экрана 2023-01-18 в 11 27 00
CAM-Gerlach commented 1 year ago

As a general note of caution, especially when submitting PRs fixing these sorts of widespread and potentially nitpicky (heh) docs defects, especially when in cases like this there are a number of different possibilities to handle each warning instance, we should take care to avoid the folly of large "omnibus" PRs (as Guido likes to call them) and take care to consider each specific change we're making holistically, and ensure that we're picking the approach that best serves the reader in for each specific context, as opposed to just applying one type of fix mechanically to all instances, or even worse arbitrarily picking one or another each time without careful thought and consideration that might fix the warning but be an overall regression.

Otherwise, if we're too focused solely on the narrow objective of getting rid of the warnings by whatever means, as opposed to the broader goal of improving the overall quality of the docs, we risk both doing exactly the opposite of the latter, and consuming the limited time and churn budget of core developers and other reviewers on changes with little or even negative practical benefit.

encukou commented 1 year ago

That's the reasoning behind teaching the CI to only warn, and only on changed files.

hugovk commented 1 year ago

To enable this, it would be really useful if Sphinx had more granular config for the warnings/errors, similar to Flake8.

For example, we could enable nitpicky only for certain directories and files, and expand as more are cleaned. And possibly in combination with an exclude option.

Similarly, we may only want to enable/disable nitpicky for certain classes of warnings/errors.

That would allow us to iteratively fix things, and keep them fixed.

cc @AA-Turner

hugovk commented 1 year ago

PR to fix 113 warnings in the decimal module: https://github.com/python/cpython/pull/102125

CAM-Gerlach commented 1 year ago

Yeah; you can silence warnings for particular names, but not in particular files, which would IMO be much more useful. Besides just incrementally fixing these issues, it would also be very useful to potentially keep permanently for What's New and Changelog pages (other than those for the current feature version in a particular branch) as those will naturally drift out of date over time. I believe I mentioned this on a Sphinx issue somewhere at some point semi-recently, but if I did I can't seem to find it now.

timobrembeck commented 1 year ago

Does anybody have an opinion on the docstring-side of matters? In other words, should the docstrings inside the Python source code also comply to Sphinx's nit-picky mode? (see e.g. #100989)

CAM-Gerlach commented 1 year ago

IMO, it's generally helpful for the docstrings to use the unambiguous, explicit and precise types if feasible, but particularly for the docstrings, avoiding warnings in -n mode seems secondary to me to ensuring they are clear, helpful and consistent for readers, per Diataxis on the overall function of reference docs, particularly since CPython itself doesn't build the docstrings. If the latter can be satisfied while serving the former purpose and being enough of a non-trivial and thoughtful improvement to not quality as mechanical churn, then it would seem to me to have net value.

In the particular case of your PR #100990 , it appears to make substantial clarity and descriptiveness improvements to the docstrings beyond just the above change (which is really secondary in benefit to the latter), so to me it appears to be a pretty clear net win.

hugovk commented 1 year ago

As discussed in yesterday's Documentation Community Team Meeting, please see PR https://github.com/python/cpython/pull/102513 to add two nit-picky checks to the CI:

  1. show Sphinx warnings in changed files, can’t fail
  2. show Sphinx warnings in required-list (e.g. What’s New in 3.12), can fail

See also PR https://github.com/python/cpython/pull/102340 to fix Sphinx warnings in the turtle module.

terryjreedy commented 1 year ago

Would it be possible for the warning to include a link to possible fixups for the particular warning? I don't remember reading about '~object' or '!CON' and I imaging others are similarly deficient in sphinx-fu.

CAM-Gerlach commented 1 year ago

Would it be possible for the warning to include a link to possible fixups for the particular warning?

That would be a great idea, but its kinda hard to programmatically determine the most appropriate solution for a particular case without considering the specifics and context of the situation, which can be very quick for a human like us with the specific expertise and experience (really, the latter more than the former, since the syntax can be a lot easier to grasp than the semantics) but much harder to write reliable prescriptive rules for.

IMO, it would be better for most folks to just not worry about it at all and instead just tag a docs team member (like me) and we can pop over and make a one-click suggestion (or commit) with the best fix if desired. For these kind of issues, I can have a <24 hour, perhaps close to 12-hour SLA.

I don't remember reading about '~object' or '!CON' and I imaging others are similarly deficient in sphinx-fu.

@ezio-melotti made a nice quick-reference table with all of the common syntax bits, including those, over on the devguide, and the Sphinx docs also has a concise primer covering those as well, but TL;DR these are the main cases:

serhiy-storchaka commented 1 year ago

I did not use it, but I seen other projects use something like

.. c:type:: wchar_t
   :hidden:

for silencing warnings about :c:type:`wchar_t`.

Perhaps we can add similar hidden entries for not defined but referenced names.

vstinner commented 1 year ago

I created PR #107298 to fix some Sphinx warnings in the C API Documentation.

CAM-Gerlach commented 1 year ago

For newer contributors looking to tackle this, be advised there's no one size fits all approach that's the right one for every warning; instead, there are a handful of common cases covering the great majority of them, and a few few edge/special cases for the rest.

I'd like to write up a proper mini-guide for this, but my bandwidth is limited at the moment. However, here's an overview of how to handle the most common cases. If you have something that isn't covered here, feel free to reply and suggest it be added, or if you have questions about a special case, I'm always happy to answer.

In summary, here's a basic algorithm for determining such:

  1. Check the named target. If it is:
    • clearly not something that should be a linkable cross reference (example object, sample code, etc), replace the ref role with literals.
    • a standard C function, system call, environment variable, etc, add it to conf.py.
    • a documented function with parameters specified in the ref, specify just the callable name inside <> after the existing text
    • A generic dunder method (e.g. __init__), prepend ~object. (or a more specific class name, if present)
  2. Search the target name and variations using the Python docs search and/or sphobjinv. If found:
    • Determine the most appropriate target (if multiple), and fix the cross reference per Table 1
  3. If not found in latest docs, search old versions of the docs, e.g. 5 versions back or using Google for it (see Table 3). If found:
    • If important to still keep in current docs, prepend the target name with !
    • Otherwise, remove the ref target and its relevant context
  4. If seems to be a not-documented target:
    • Add/fix the reference target, per Table 2
  5. You've found an uncovered/special case.
    • Ask here for more help.

Here's a more detailed listing of specific cases with solutions and examples, divided into a few base categories. Tables are in rough order of commonality.

Target exists; ref doesn't correctly point to it

Solution: Fix the cross reference.

Table 1

Case Solution Example
Ref target missing class, module, etc prefix Add missing prefix, with ~ to hide it as appropriate :func:`exit` in :mod:`sys` -> :func:`~sys.exit` in :mod:`sys`
Generic dunder method (e.g. __init__) missing prefix Add missing prefix: A custom class if specifically documented under such, else ~object :meth:`__init__` -> :meth:`~object.__init__`
Missing domain prefix (c:, etc) Add domain prefix :func:`PyObject_Call` -> :c:func:`PyObject_Call`
Wrong role used Use correct role :class:`dataclasses.dataclass` -> :func:`dataclasses.dataclass`
Typo/mistake in target name Fix typo/mistake :func:`dataclass.dataclasses` -> :func:`dataclasses.dataclass`
API documented under other alias Change cross-reference to "canonical" name, possibly keeping alias as displayed text :meth:`turtle.RawTurtle.tiltangle` -> :func:`turtle.RawTurtle.titlangle <turtle.tiltangle>`
Documented callable with parameters Set title to full call and target to just callable cname :func:`sys.exit(0)` -> :func:`sys.exit(0) <sys.exit>`

Target doesn't exist and should

Solution: Add/fix the target.

Table 2

Case Solution Example
API not formally documented Properly document API :func:`enum.show_flag_values` -> Keep plus document, e.g. .. function:: show_flag_values(...)
API documented under wrong directive Fix API docs to use correct directive & update xrefs :c:func:`PyObject_Call` with .. function:: PyObject_Call(...) -> Fix latter to .. c:function:: PyObject_Call(...)
API documented under wrong name Fix API docs to use correct name; add aliases to using ref labels so fragment UIRs don't break :meth:`Cursor.execute` with .. method:: Cursor.execute() underneath .. class:: Cursor (double class name, means target is Cursor.Cursor.execute) -> Fix directive to use just .. method:: execute()

Target doesn't exist, and used to

Solution: De-resolve reference if important to still keep; otherwise remove xref and relevant context.

Table 3

Case Solution Example
Ref target obsolete/removed Either remove obsolete reference & associated context, or if necessary to keep, prepend ! :mod:`distutils` -> :mod:`!distutils` (or remove)
:ref: target label should/used to exist, but doesn't Find where it used to/should point to, adding one if needed (or remove reference if obsolete) :ref:`old-section-name` -> :ref:`new-section-name` (or add alias)

Target doesn't exist, and shouldn't

Solution: Remove, silence or ignore the cross reference

Table 4

Case Solution Example
Example object Use literals instead :meth:`Spam.eggs` -> ``Spam.eggs()``
Sample code Use literals instead :func:`print("An example use of print")` -> ``print("An example use of print")``
C stdlib/system call Add to conf.py ignore list :c:func:`printf` -> Leave as-is, ignore in conf.py
Standard environment variable Add to conf.py ignore list :envvar:`PATH` -> Leave as-is, ignore in conf.py
serhiy-storchaka commented 1 year ago

Great guide, @CAM-Gerlach. Some minor corrections:

CAM-Gerlach commented 1 year ago

Thanks so much for the detailed review, @serhiy-storchaka ! Very impressive to find all those all those typos. I've edited my post to fix all of them. If you have more fixes, or suggestions for other cases to cover, keep 'em coming, thanks!

It is .. function:: and .. c:function:: directives, not .. func:: and .. :c:func::.

Oops, yeah of course—I was getting so used to typing the role syntax over and over I was just on autopilot.

:func: and :meth: roles automatically add (). If you convert references into literal text, add explicit () after callable name. :meth:`Spam.eggs` -> ``Spam.eggs() or :meth:!Spam.eggs ``.

Yeah—just wasn't being as careful as you here to replicate the rendered form.

Example :role:`target` -> :role:`target` is not clear.

Actually, you found a placeholder I forgot to replace—added an actual example there, thanks.

AlexWaygood commented 11 months ago

Where should module attributes be documented? We currently have them documented in at least three places:

We can reduce duplication and fix a bunch of warnings by deleting two of these, and linking from those places to a single canonical reference. But which is the most appropriate to keep?

serhiy-storchaka commented 9 months ago

114546 created incorrect references and index entries.

hugovk commented 9 months ago

See https://discuss.python.org/t/broken-references-in-sphinx-docs/19463/7 for a one-year progress update! 🧹📚

bedevere-app[bot] commented 9 months ago

GH-114771 is a backport of this pull request to the 3.12 branch.

bedevere-app[bot] commented 9 months ago

GH-114773 is a backport of this pull request to the 3.11 branch.

bedevere-app[bot] commented 9 months ago

GH-115310 is a backport of this pull request to the 3.11 branch.

bedevere-app[bot] commented 9 months ago

GH-115311 is a backport of this pull request to the 3.11 branch.