JulienPalard commented 1 year ago

Documentation

Running sphinx-build in nit-picky mode, like:

sphinx-build -n . build/html/

or:

make html SPHINXERRORHANDLING=-n

gives tons of warnings: about 8k of them at the time of writing.
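For reference, the same mode can be turned on from the configuration file rather than the command line. This is only a minimal fragment, assuming an ordinary Sphinx conf.py; `nitpicky` is the Sphinx configuration value that corresponds to the `-n` switch:

```python
# conf.py (fragment): equivalent to passing -n to sphinx-build.
# With this set, Sphinx warns about every reference whose target
# cannot be found, not just the ones it normally reports.
nitpicky = True
```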

Most of them are innocent, like using :const: while referring to a constant defined in another file, or using :meth:`__init__` to speak about the concept of an init method.

They fall into several cases:

The target is not documented at all, then document it, like in 664aa94b570a4a8f3535efb2e3d638a4ab655943
There's a typo, then fix it.
We're mentioning something that does not exist (or no longer exists), then rewrite.
Innocent usage of rst markers without the intention of linking to something, then ???

I don't have statistics (yet) on the distribution for the 4 previous cases.
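Sorting warnings into those four cases needs a human, but a rough first cut is to count them by role. The sketch below assumes the warnings were saved to a log and that each line has the shape `<file>:<line>: WARNING: <role> reference target not found: <target>`. The sample lines (including their line numbers) are made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a nit-picky warning line; a trailing "[ref.xxx]" tag,
# if present, is tolerated and ignored.
WARNING_RE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+): WARNING: "
    r"(?P<role>\S+) reference target not found: "
    r"(?P<target>.+?)(?: \[[\w.]+\])?$"
)


def tally_by_role(log_lines):
    """Count 'reference target not found' warnings per role."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.match(line.strip())
        if match:
            counts[match["role"]] += 1
    return counts


# Hypothetical sample data, not real build output.
sample = [
    "library/bz2.rst:91: WARNING: py:meth reference target not found: read1",
    "library/re.rst:10: WARNING: py:const reference target not found: IGNORECASE",
    "library/re.rst:42: WARNING: py:const reference target not found: MULTILINE",
    "some unrelated build output",
]
counts = tally_by_role(sample)
print(counts.most_common())
```

This only says which roles dominate; deciding whether a given warning is a typo, a missing target or an innocent usage still means reading it.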

For the innocent usages of rst markers, there's two fixes:

Drop the role, :const:`IGNORECASE` becomes ``IGNORECASE``. We lose a bit of information in the rst file, I'm not fully convinced.
Make it point to something existing, :const:`IGNORECASE` becomes :const:`re.IGNORECASE`, or even :const:`~re.IGNORECASE` to keep the same output. It means more typing, and more links in the HTML output which are not all useful.

My question is: should we try to fix all warnings so we can easily spot typos at build time?

I tried to see if some typo would have been avoided by the nit-picky mode by reading a few pages of WARNINGS and found just one: read1 instead of read in bz2.rst.

Linked PRs

gh-101122
gh-101173
gh-102125
gh-102237
gh-102238
gh-102260
gh-102262
gh-102263
gh-102274
gh-102291
gh-102292
gh-102340
gh-102513
gh-102526
gh-102529
gh-102530
gh-102597
gh-102638
gh-102639
gh-102695
gh-102727
gh-102728
gh-103001
gh-103002
gh-103019
gh-103116
gh-103135
gh-103191
gh-103289
gh-103457
gh-103531
gh-103803
gh-103818
gh-104124
gh-105265
gh-106172
gh-106191
gh-106192
gh-106204
gh-106205
gh-106220
gh-106278
gh-106281
gh-106282
gh-106299
gh-106460
gh-107095
gh-107103
gh-107120
gh-107386
gh-107418
gh-107419
gh-107966
gh-108065
gh-108127
gh-108678
gh-108680
gh-108681
gh-108684
gh-108707
gh-108708
gh-108746
gh-108750
gh-108755
gh-108756
gh-108757
gh-108759
gh-108805
gh-108807
gh-108808
gh-108810
gh-108812
gh-108813
gh-109394
gh-109416
gh-109417
gh-109424
gh-109799
gh-109800
gh-109801
gh-109814
gh-109881
gh-109883
gh-109884
gh-109931
gh-109937
gh-109938
gh-109963
gh-109966
gh-109967
gh-110074
gh-110081
gh-110082
gh-110084
gh-110085
gh-110086
gh-110087
gh-110112
gh-110113
gh-110114
gh-110115
gh-110118
gh-110135
gh-110136
gh-110140
gh-110141
gh-110144
gh-110185
gh-110187
gh-110207
gh-110461
gh-110623
gh-110624
gh-110841
gh-110855
gh-110856
gh-110862
gh-110877
gh-110878
gh-110979
gh-111070
gh-111071
gh-111072
gh-111073
gh-111074
gh-111075
gh-111076
gh-111079
gh-111080
gh-111097
gh-111098
gh-111118
gh-111173
gh-111175
gh-111176
gh-111179
gh-111185
gh-111186
gh-111222
gh-111226
gh-111227
gh-111469
gh-111470
gh-112351
gh-112357
gh-112366
gh-112373
gh-112381
gh-112382
gh-112390
gh-112391
gh-112392
gh-112393
gh-112399
gh-112402
gh-112404
gh-112416
gh-112420
gh-112422
gh-112455
gh-112456
gh-112662
gh-112666
gh-112667
gh-112668
gh-112669
gh-112673
gh-112674
gh-112697
gh-112698
gh-112701
gh-112702
gh-112703
gh-112704
gh-112705
gh-112735
gh-112737
gh-112748
gh-112749
gh-112757
gh-112772
gh-112775
gh-112781
gh-112789
gh-112790
gh-112811
gh-112813
gh-112815
gh-112816
gh-112817
gh-112832
gh-112833
gh-112836
gh-112857
gh-112858
gh-112868
gh-112869
gh-112870
gh-112872
gh-112873
gh-112874
gh-112875
gh-112884
gh-112886
gh-112908
gh-112910
gh-112911
gh-112912
gh-112913
gh-112929
gh-112930
gh-112933
gh-112974
gh-112981
gh-113001
gh-113003
gh-113029
gh-113030
gh-113031
gh-113043
gh-113044
gh-113057
gh-113061
gh-113062
gh-113106
gh-113107
gh-113109
gh-113110
gh-113111
gh-113112
gh-113116
gh-113124
gh-113125
gh-113136
gh-113137
gh-113142
gh-113143
gh-113144
gh-113145
gh-113158
gh-113159
gh-113162
gh-113163
gh-113181
gh-113182
gh-113183
gh-113184
gh-113237
gh-113244
gh-113245
gh-113289
gh-113290
gh-113291
gh-113493
gh-113494
gh-113496
gh-113497
gh-113498
gh-113500
gh-113502
gh-113504
gh-113505
gh-113508
gh-113509
gh-113510
gh-113511
gh-113551
gh-113552
gh-113564
gh-113598
gh-113599
gh-113600
gh-113629
gh-113641
gh-113642
gh-113669
gh-113681
gh-113725
gh-113734
gh-113735
gh-113739
gh-113748
gh-113749
gh-113752
gh-113846
gh-113847
gh-113946
gh-113998
gh-114001
gh-114060
gh-114063
gh-114064
gh-114194
gh-114280
gh-114327
gh-114373
gh-114377
gh-114378
gh-114425
gh-114436
gh-114437
gh-114469
gh-114477
gh-114478
gh-114518
gh-114519
gh-114521
gh-114525
gh-114526
gh-114527
gh-114531
gh-114546
gh-114584
gh-114585
gh-114635
gh-114640
gh-114641
gh-114646
gh-114649
gh-114650
gh-114652
gh-114654
gh-114658
gh-114661
gh-114669
gh-114696
gh-114704
gh-114705
gh-114711
gh-114712
gh-114716
gh-114718
gh-114729
gh-114769
gh-114770
gh-114771
gh-114773
gh-114786
gh-114793
gh-114794
gh-114825
gh-114846
gh-114872
gh-114878
gh-114958
gh-114969
gh-114970
gh-114972
gh-114981
gh-114982
gh-114983
gh-114988
gh-114989
gh-114992
gh-114999
gh-115003
gh-115135
gh-115141
gh-115208
gh-115209
gh-115263
gh-115284
gh-115297
gh-115308
gh-115310
gh-115311
gh-115319
gh-115330
gh-115331
gh-115430
gh-115575
gh-115580
gh-115587
gh-115588
gh-115589
gh-115590
gh-115691
gh-115902
gh-115903
gh-115924
gh-115925
gh-115932
gh-115933
gh-116913
gh-117037
gh-117038
gh-118353
gh-118356
gh-118364
gh-118365
gh-118366
gh-118367
gh-124106
gh-124480
gh-124556
gh-124558
gh-124577
gh-124579
gh-124580
gh-124709
gh-125190
gh-125191
gh-125208
gh-125211

encukou commented 1 year ago

IMO, we should fix these. Many are actual issues in the documentation.

Here's a proof-of-concept GH Action that complains about nitpicks in changed files only: https://github.com/encukou/cpython/pull/21/commits/e97d91f71e336c474b047335f0084b2eb95a121e Feel free to take it, I don't think I'll have time for this soon.

CAM-Gerlach commented 1 year ago

IMO, we should fix these. Many are actual issues in the documentation.

Yes, definitely; I've been gradually helping on the docs I've/we've touched anyway, though it is a long term project. I've found a number of things that were undocumented, as well as a number of other real issues through that, e.g. on the sqlite3 module that Erlend and I fixed.

Most of them are innocent, like using :const: while referring to a constant defined in another file, or using :meth:`__init__` to speak about the concept of an init method.

I'd argue neither is exactly innocent:

For the first, unless there's some particular reason not to, it usually makes sense to actually cross reference the constant, so the reader can find out more, and be explicit about its location at least in the source (since ~ can hide that in the rendered output if not desired). This also means if the constant is renamed, moved or removed, Sphinx will issue a warning about it.
For the second, doing :meth:`~object.__init__` to cross reference the full description of an __init__ method will often still have value to readers, at least on first usage in a context, and doesn't take up any more space. We could potentially extend the :meth: role to automatically prepend a default name, i.e. object, to :meth: and :attr: that only have a single component, and perform the lookup with that—though the Sphinx built in roles are not the best structured to be able to extend easily without copying a fair bit of code.
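To make the "default prefix" idea concrete, here is a sketch of just the target-rewriting step as a plain string function. It is not a Sphinx role and does not touch Sphinx internals; the function name and the `default` parameter are hypothetical.

```python
def with_default_prefix(target, default="object"):
    """Prepend ``~<default>.`` to a single-component target.

    Targets that already have a dotted prefix, or that start with
    ``!`` or ``~``, are returned unchanged.
    """
    if target.startswith(("!", "~")) or "." in target:
        return target
    return f"~{default}.{target}"


print(with_default_prefix("__init__"))
print(with_default_prefix("Spam.eggs"))
```

A real implementation would have to hook this into the lookup done by the :meth: and :attr: roles, which is the part that is awkward to extend.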

For the innocent usages of rst markers, there's two fixes:

There's a third, much simpler and better fix. If the link isn't desired, every Sphinx cross-reference role supports prepending ! to prevent Sphinx from trying to resolve the reference, which avoids both the warning and the (very slight) build-time lookup cost, while preserving both all the information in the source, and the formatting in the output (since the formatting of named roles and regular literals is not the same, at least in our current theme), while being easier than both (just add one character). So, you could just do:

:const:`!IGNORECASE`
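Side by side, the three options discussed so far produce the following source text. This is only a string-building sketch to compare the spellings; the helper names are hypothetical.

```python
def as_literal(target):
    """Option 1: drop the role and use a plain literal."""
    return f"``{target}``"


def qualified(role, target, module, hide_prefix=True):
    """Option 2: point at the real object, hiding the prefix with ~."""
    tilde = "~" if hide_prefix else ""
    return f":{role}:`{tilde}{module}.{target}`"


def unresolved(role, target):
    """Option 3: keep the role but prepend ! so it is not resolved."""
    return f":{role}:`!{target}`"


print(as_literal("IGNORECASE"))
print(qualified("const", "IGNORECASE", "re"))
print(unresolved("const", "IGNORECASE"))
```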

sobolevn commented 1 year ago

I've sent an example PR with the fix for enum module: https://github.com/python/cpython/pull/101122

The dev experience was pleasant, because on the second run Sphinx only rebuilds (and warns about) changed files.

CAM-Gerlach commented 1 year ago

As a general note of caution when submitting PRs that fix these sorts of widespread and potentially nitpicky (heh) docs defects, especially in cases like this where there are several ways to handle each warning instance: we should avoid the folly of large "omnibus" PRs (as Guido likes to call them). We should consider each specific change holistically and make sure we pick the approach that best serves the reader in each specific context, as opposed to applying one type of fix mechanically to all instances, or, even worse, arbitrarily picking one or another each time without careful thought, which might fix the warning but be an overall regression.

Otherwise, if we're too focused solely on the narrow objective of getting rid of the warnings by whatever means, as opposed to the broader goal of improving the overall quality of the docs, we risk both doing exactly the opposite of the latter, and consuming the limited time and churn budget of core developers and other reviewers on changes with little or even negative practical benefit.

encukou commented 1 year ago

That's the reasoning behind teaching the CI to only warn, and only on changed files.

hugovk commented 1 year ago

To enable this, it would be really useful if Sphinx had more granular config for the warnings/errors, similar to Flake8.

For example, we could enable nitpicky only for certain directories and files, and expand as more are cleaned. And possibly in combination with an exclude option.

Similarly, we may only want to enable/disable nitpicky for certain classes of warnings/errors.

That would allow us to iteratively fix things, and keep them fixed.

cc @AA-Turner

hugovk commented 1 year ago

PR to fix 113 warnings in the decimal module: https://github.com/python/cpython/pull/102125

CAM-Gerlach commented 1 year ago

Yeah; you can silence warnings for particular names, but not in particular files, which would IMO be much more useful. Besides just incrementally fixing these issues, it would also be very useful to potentially keep permanently for What's New and Changelog pages (other than those for the current feature version in a particular branch) as those will naturally drift out of date over time. I believe I mentioned this on a Sphinx issue somewhere at some point semi-recently, but if I did I can't seem to find it now.

timobrembeck commented 1 year ago

Does anybody have an opinion on the docstring side of matters? In other words, should the docstrings inside the Python source code also comply with Sphinx's nit-picky mode? (see e.g. #100989)

CAM-Gerlach commented 1 year ago

IMO, it's generally helpful for the docstrings to use the unambiguous, explicit and precise types if feasible, but particularly for the docstrings, avoiding warnings in -n mode seems secondary to me to ensuring they are clear, helpful and consistent for readers, per Diataxis on the overall function of reference docs, particularly since CPython itself doesn't build the docstrings. If the latter can be satisfied while serving the former purpose and being enough of a non-trivial and thoughtful improvement to not qualify as mechanical churn, then it would seem to me to have net value.

In the particular case of your PR #100990 , it appears to make substantial clarity and descriptiveness improvements to the docstrings beyond just the above change (which is really secondary in benefit to the latter), so to me it appears to be a pretty clear net win.

hugovk commented 1 year ago

As discussed in yesterday's Documentation Community Team Meeting, please see PR https://github.com/python/cpython/pull/102513 to add two nit-picky checks to the CI:

show Sphinx warnings in changed files, can’t fail
show Sphinx warnings in required-list (e.g. What’s New in 3.12), can fail
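The two-tier idea can be sketched as a small filter over the warning log. This is not the code from the PR, just an illustration. It assumes each warning line starts with `<file>:<line>: WARNING:` and that paths are already relative to the docs directory; the function name and sample data are hypothetical.

```python
import re

# Assumed prefix of a Sphinx warning line.
WARNING_RE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+): WARNING: (?P<msg>.+)$")


def check_warnings(log_lines, changed_files, must_be_clean):
    """Split warnings into informational ones and failing ones.

    Warnings in ``must_be_clean`` files fail the check; warnings in
    other changed files are only reported; everything else is ignored.
    """
    informational, failures = [], []
    for line in log_lines:
        match = WARNING_RE.match(line.strip())
        if not match:
            continue
        path = match["file"]
        if path in must_be_clean:
            failures.append(line)
        elif path in changed_files:
            informational.append(line)
    return informational, failures


# Hypothetical sample data.
log = [
    "whatsnew/3.12.rst:10: WARNING: py:func reference target not found: foo",
    "library/turtle.rst:20: WARNING: py:meth reference target not found: bar",
    "library/re.rst:5: WARNING: py:const reference target not found: IGNORECASE",
]
info, fails = check_warnings(
    log,
    changed_files={"library/turtle.rst"},
    must_be_clean={"whatsnew/3.12.rst"},
)
print("report only:", info)
print("fail the build:", fails)
```

The "must be clean" set can then grow as more files are fixed, which keeps them fixed.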

See also PR https://github.com/python/cpython/pull/102340 to fix Sphinx warnings in the turtle module.

terryjreedy commented 1 year ago

Would it be possible for the warning to include a link to possible fixups for the particular warning? I don't remember reading about '~object' or '!CON' and I imagine others are similarly deficient in sphinx-fu.

CAM-Gerlach commented 1 year ago

Would it be possible for the warning to include a link to possible fixups for the particular warning?

That would be a great idea, but it's kinda hard to programmatically determine the most appropriate solution for a particular case without considering the specifics and context of the situation, which can be very quick for a human like us with the specific expertise and experience (really, the latter more than the former, since the syntax can be a lot easier to grasp than the semantics) but much harder to write reliable prescriptive rules for.

IMO, it would be better for most folks to just not worry about it at all and instead just tag a docs team member (like me) and we can pop over and make a one-click suggestion (or commit) with the best fix if desired. For these kinds of issues, I can have a <24 hour, perhaps close to 12-hour SLA.

I don't remember reading about '~object' or '!CON' and I imagine others are similarly deficient in sphinx-fu.

@ezio-melotti made a nice quick-reference table with all of the common syntax bits, including those, over on the devguide, and the Sphinx docs also have a concise primer covering those as well, but TL;DR these are the main cases:

If the object exists and should be formally documented but isn't, either document it or leave the warning until it is
If an object exists but is unintentionally broken due to the wrong class, module, etc. prefix, a typo, etc, fix it
If an object is missing an implicit prefix that should not be displayed in the final output, like object for dunder methods or the class name when methods are listed individually, add the prefix with a prepended ~ to hide it (e.g. :meth:`~object.__str__`, the :meth:`~Spam.get_foo` method of the :class:`Spam` class)
If an object was removed from Python, is deliberately undocumented or is a subsequent reference to something already linked in the same scope, prepend ! to not resolve the reference (e.g. :meth:`!Spam.removed_method`, :c:func:`!_internal_api`)
If an object is an example or not otherwise a "real" class, function, etc., either do the above or use a literal (e.g. ``ExampleClass.example_attr`` ``fake_func()``).

serhiy-storchaka commented 1 year ago

I did not use it, but I have seen other projects use something like

.. c:type:: wchar_t
   :hidden:

for silencing warnings about :c:type:`wchar_t`.

Perhaps we can add similar hidden entries for not defined but referenced names.

vstinner commented 1 year ago

I created PR #107298 to fix some Sphinx warnings in the C API Documentation.

CAM-Gerlach commented 1 year ago

For newer contributors looking to tackle this, be advised there's no one-size-fits-all approach that's the right one for every warning; instead, there are a handful of common cases covering the great majority of them, and a few edge/special cases for the rest.

I'd like to write up a proper mini-guide for this, but my bandwidth is limited at the moment. However, here's an overview of how to handle the most common cases. If you have something that isn't covered here, feel free to reply and suggest it be added, or if you have questions about a special case, I'm always happy to answer.

In summary, here's a basic algorithm for determining the right fix:

1. Check the named target. If it is:
   - clearly not something that should be a linkable cross reference (example object, sample code, etc.), replace the ref role with literals.
   - a standard C function, system call, environment variable, etc., add it to conf.py.
   - a documented function with parameters specified in the ref, specify just the callable name inside <> after the existing text.
   - a generic dunder method (e.g. __init__), prepend ~object. (or a more specific class name, if present).
2. Search the target name and variations using the Python docs search and/or sphobjinv. If found:
   - Determine the most appropriate target (if multiple), and fix the cross reference per Table 1.
3. If not found in the latest docs, search old versions of the docs (e.g. 5 versions back) or use Google (see Table 3). If found:
   - If it is important to keep in the current docs, prepend ! to the target name.
   - Otherwise, remove the ref target and its relevant context.
4. If it seems to be an undocumented target:
   - Add/fix the reference target, per Table 2.
5. Otherwise, you've found an uncovered/special case.
   - Ask here for more help.
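As a rough illustration, the steps above can be written down as a decision function. The facts it relies on (whether something is an example, whether it shows up in current or old docs, and so on) still have to be determined by a person. The `Finding` class, its field names and the returned strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Facts a human has established about one warning (hypothetical)."""
    target: str
    is_example: bool = False
    is_external_standard: bool = False   # C stdlib, system call, env var...
    found_in_current_docs: bool = False
    found_in_old_docs: bool = False
    still_important: bool = False
    should_be_documented: bool = False


def suggest_fix(f):
    """Walk the steps in order and return a suggested action."""
    if f.is_example:
        return "replace the role with a literal"
    if f.is_external_standard:
        return "add to the conf.py ignore list"
    if "(" in f.target and "<" not in f.target:
        return "keep the call as title, put the bare callable name in <>"
    name = f.target.lstrip("~!")
    if "." not in name and name.startswith("__") and name.endswith("__"):
        return "prepend ~object. (or a more specific class)"
    if f.found_in_current_docs:
        return "fix the cross reference (Table 1)"
    if f.found_in_old_docs:
        if f.still_important:
            return "prepend !"
        return "remove the reference and its context"
    if f.should_be_documented:
        return "add or fix the reference target (Table 2)"
    return "ask for help"


print(suggest_fix(Finding("__init__")))
print(suggest_fix(Finding("distutils", found_in_old_docs=True, still_important=True)))
```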

Here's a more detailed listing of specific cases with solutions and examples, divided into a few base categories. Tables are in rough order of commonality.

Target exists; ref doesn't correctly point to it

Solution: Fix the cross reference.

Table 1

Case	Solution	Example
Ref target missing class, module, etc prefix	Add missing prefix, with `~` to hide it as appropriate	:func:`exit` in :mod:`sys` -> :func:`~sys.exit` in :mod:`sys`
Generic dunder method (e.g. `__init__`) missing prefix	Add missing prefix: A custom class if specifically documented under such, else `~object`	:meth:`__init__` -> :meth:`~object.__init__`
Missing domain prefix (`c:`, etc)	Add domain prefix	:func:`PyObject_Call` -> :c:func:`PyObject_Call`
Wrong role used	Use correct role	:class:`dataclasses.dataclass` -> :func:`dataclasses.dataclass`
Typo/mistake in target name	Fix typo/mistake	:func:`dataclass.dataclasses` -> :func:`dataclasses.dataclass`
API documented under other alias	Change cross-reference to "canonical" name, possibly keeping alias as displayed text	:meth:`turtle.RawTurtle.tiltangle` -> :func:`turtle.RawTurtle.tiltangle <turtle.tiltangle>`
Documented callable with parameters	Set title to full call and target to just callable name	:func:`sys.exit(0)` -> :func:`sys.exit(0) <sys.exit>`
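The last row of Table 1 is mechanical enough to sketch as a string rewrite: keep the full call as the displayed title and use everything before the first parenthesis as the target. The helper name is hypothetical.

```python
def retarget_call(role, text):
    """Turn ``name(args)`` into an explicit-title cross reference."""
    name = text.split("(", 1)[0].strip()
    return f":{role}:`{text} <{name}>`"


print(retarget_call("func", "sys.exit(0)"))
```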

Target doesn't exist and should

Solution: Add/fix the target.

Table 2

Case	Solution	Example
API not formally documented	Properly document API	:func:`enum.show_flag_values` -> Keep plus document, e.g. `.. function:: show_flag_values(...)`
API documented under wrong directive	Fix API docs to use correct directive & update xrefs	:c:func:`PyObject_Call` with `.. function:: PyObject_Call(...)` -> Fix latter to `.. c:function:: PyObject_Call(...)`
API documented under wrong name	Fix API docs to use correct name; add aliases using ref labels so fragment URIs don't break	:meth:`Cursor.execute` with `.. method:: Cursor.execute()` underneath `.. class:: Cursor` (double class name, means target is `Cursor.Cursor.execute`) -> Fix directive to use just `.. method:: execute()`

Target doesn't exist, and used to

Solution: De-resolve reference if important to still keep; otherwise remove xref and relevant context.

Table 3

Case	Solution	Example
Ref target obsolete/removed	Either remove obsolete reference & associated context, or if necessary to keep, prepend `!`	:mod:`distutils` -> :mod:`!distutils` (or remove)
`:ref:` target label should/used to exist, but doesn't	Find where it used to/should point to, adding one if needed (or remove reference if obsolete)	:ref:`old-section-name` -> :ref:`new-section-name` (or add alias)

Target doesn't exist, and shouldn't

Solution: Remove, silence or ignore the cross reference

Table 4

Case	Solution	Example
Example object	Use literals instead	:meth:`Spam.eggs` -> ``Spam.eggs()``
Sample code	Use literals instead	:func:`print("An example use of print")` -> ``print("An example use of print")``
C stdlib/system call	Add to `conf.py` ignore list	:c:func:`printf` -> Leave as-is, ignore in `conf.py`
Standard environment variable	Add to `conf.py` ignore list	:envvar:`PATH` -> Leave as-is, ignore in `conf.py`
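For the last two rows of Table 4, the "ignore list" is Sphinx's `nitpick_ignore` setting, a list of `(type, target)` pairs. A minimal fragment, assuming the entries below are the ones wanted; the exact set to ignore is a per-project decision:

```python
# conf.py (fragment): references that nit-picky mode should not warn about.
nitpick_ignore = [
    # Standard C functions that are not documented in these docs
    ("c:func", "printf"),
    # Standard environment variables
    ("envvar", "PATH"),
]
```

Sphinx also has `nitpick_ignore_regex` for pattern-based entries. Both work by name across the whole project, not per file.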

serhiy-storchaka commented 1 year ago

Great guide, @CAM-Gerlach. Some minor corrections:

It is .. function:: and .. c:function:: directives, not .. func:: and .. :c:func::.
:func: and :meth: roles automatically add (). If you convert references into literal text, add explicit () after callable name. :meth:`Spam.eggs` -> ``Spam.eggs()`` or :meth:`!Spam.eggs`.
Typo :c:func:`prinf`.
Example :role:`target` -> :role:`target` is not clear.
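The point about parentheses can be sketched as a small conversion helper: when a :func: or :meth: reference is turned into a literal, append () so the rendered text stays the same. The helper name and the set of roles are assumptions; other callable roles may behave the same way.

```python
# Roles that the comment above says add "()" automatically.
CALLABLE_ROLES = {"func", "meth"}


def role_to_literal(role, target):
    """Convert a role reference to a literal, keeping the trailing ()."""
    needs_parens = role in CALLABLE_ROLES and not target.endswith(")")
    suffix = "()" if needs_parens else ""
    return f"``{target}{suffix}``"


print(role_to_literal("meth", "Spam.eggs"))
print(role_to_literal("const", "IGNORECASE"))
```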

CAM-Gerlach commented 1 year ago

Thanks so much for the detailed review, @serhiy-storchaka ! Very impressive to find all those typos. I've edited my post to fix all of them. If you have more fixes, or suggestions for other cases to cover, keep 'em coming, thanks!

It is .. function:: and .. c:function:: directives, not .. func:: and .. :c:func::.

Oops, yeah of course—I was getting so used to typing the role syntax over and over I was just on autopilot.

:func: and :meth: roles automatically add (). If you convert references into literal text, add explicit () after callable name. :meth:`Spam.eggs` -> ``Spam.eggs()`` or :meth:`!Spam.eggs`.

Yeah—just wasn't being as careful as you here to replicate the rendered form.

Example :role:`target` -> :role:`target` is not clear.

Actually, you found a placeholder I forgot to replace—added an actual example there, thanks.

AlexWaygood commented 11 months ago

Where should module attributes be documented? We currently have them documented in at least three places:

We can reduce duplication and fix a bunch of warnings by deleting two of these, and linking from those places to a single canonical reference. But which is the most appropriate to keep?

serhiy-storchaka commented 9 months ago

114546 created incorrect references and index entries.

hugovk commented 9 months ago

See https://discuss.python.org/t/broken-references-in-sphinx-docs/19463/7 for a one-year progress update! 🧹📚

bedevere-app[bot] commented 9 months ago

GH-114771 is a backport of this pull request to the 3.12 branch.

bedevere-app[bot] commented 9 months ago

GH-114773 is a backport of this pull request to the 3.11 branch.

bedevere-app[bot] commented 9 months ago

GH-115310 is a backport of this pull request to the 3.11 branch.

bedevere-app[bot] commented 9 months ago

GH-115311 is a backport of this pull request to the 3.11 branch.

python / cpython

Fix all Sphinx reference warnings in the documentation #101100
