sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/
Other
6.6k stars 2.12k forks source link

duplicate object description warnings #10348

Closed saimn closed 2 years ago

saimn commented 2 years ago

Describe the bug

Hi, Since Sphinx 4.0 we see "duplicate object description" warnings in Astropy and other related projects that are using our sphinx-automodapi extension. For example : https://github.com/astropy/astropy/issues/11723 It took some time to report the issue because we were not sure where the issue comes from (Sphinx or sphinx-automodapi). But our recent investigations seem to show that this could be an issue in Sphinx (https://github.com/astropy/sphinx-automodapi/issues/130).

9121 and #9128 added the duplication warning and the :canonical: metadata, which seems to be applied correctly by autodoc, but it seems we still get the warning when there are more than one duplicate.

With this minimal project to reproduce the issue (https://github.com/maxnoe/sphinx-automodapi-duplicated-warning), we have a first duplicate for duplicated.Foo:

.. py:class:: Foo()
   :module: duplicated
   :canonical: duplicated.foo.foo.Foo

   Bases: :py:class:`object`

   Awesome Foo class

And another one for duplicated.foo.Foo which is the ones triggering the warning:

.. py:class:: Foo()
   :module: duplicated.foo
   :canonical: duplicated.foo.foo.Foo

   Bases: :py:class:`object`

   Awesome Foo class

In https://github.com/sphinx-doc/sphinx/blob/b4276edd848d112b4e981011c334d27cbcb20018/sphinx/domains/python.py#L1218-L1237 when the warning is triggered, we have this:

(Pdb++) pp self.objects
{'duplicated.Foo': ObjectEntry(docname='api/duplicated.Foo', node_id='duplicated.Foo', objtype='class', aliased=False),
 'duplicated.foo.Foo': ObjectEntry(docname='api/duplicated.foo.Foo', node_id='duplicated.foo.Foo', objtype='class', aliased=False),
 'duplicated.foo.foo.Foo': ObjectEntry(docname='api/duplicated.Foo', node_id='duplicated.Foo', objtype='class', aliased=True)}

So is this an issue in Sphinx, in the treatment of multiple duplicates, or there something else we are missing (and we could fix in sphinx-automodapi) ?

How to Reproduce

$ git clone https://github.com/maxnoe/sphinx-automodapi-duplicated-warning.git
$ cd sphinx-automodapi-duplicated-warning
$ pip install .
$ cd docs
$ make html

Expected behavior

No response

Your project

https://github.com/astropy/astropy

Screenshots

No response

OS

Linux

Python version

3.8

Sphinx version

4.0

Sphinx extensions

https://github.com/astropy/sphinx-automodapi

Extra tools

No response

Additional context

No response

cblegare commented 2 years ago

Hello @saimn ! I am not a sphinx dev per se, only trying to help.

I ran the example. Here are calls to sphinx.domains.python.PythonDomain.note_object for classes (I skipped modules for brevity).

  1. note_object
    • name: duplicated.Foo
    • node_id: duplicated.Foo
    • aliased: False
    • docname: api/duplicated.Foo
  2. note_object
    • name: duplicated.foo.foo.Foo (canonical name)
    • node_id: duplicated.Foo
    • aliased: True
    • docname: api/duplicated.Foo
  3. note_object
    • name: duplicated.foo.Foo
    • node_id: duplicated.foo.Foo
    • aliased: False
    • docname: api/duplicated.foo.Foo
  4. note_object
    • name: duplicated.foo.foo.Foo (canonical name)
    • node_id: duplicated.foo.Foo
    • aliased: True
    • docname: api/duplicated.foo.Foo
    • raises warning

Above, the api/duplicated.* docnames are generated by sphinx_automodapi

From my understanding, by using explicitely .. automodapi:: twice with two different modules but that generates documentation for the same class, without further configuration autodoc will emit .. py:class:: with the :canonical: option set to the actual real location for the class. Both are aliases, so they generate the warning.

Your goal is to document one "nice API module having definitions that are actually defined in different deeper modules". Sphinx does this well and I see no bug here.

The example you provided (thank you a lot for this very nice issue by the way), exposes two "nice API module having the same definitions in deeper modules". While I think that this might be a rather exotic use-case, I do not find it wrong in itself.

In case @tk0miya or an other maintainer believe that having multiple "nice API module having the same definitions in deeper modules" is a valid use-case that Sphinx should support, keying the ObjectEntry by node_id instead of by name in the sphinx.domains.python.PythonDomain.objects dictionary could be a part of the solution. That would also possibly require to review slightly the xref resolution algorithm.

Conclusion, in my opinion Your use-case is quite exotic, but not completely invalid. Sphinx behave as expected (I think) but it looks like it would be doable to adapt the Python domain to also cover your use-case.


Appendix: about the "canonical" feature:

Appendix: content of generated RestructuredText files by sphinx_automodapi

api/duplicated.Foo.rst:

Foo
===

.. currentmodule:: duplicated

.. autoclass:: Foo
   :show-inheritance:

api/duplicated.foo.Foo.rst:

Foo
===

.. currentmodule:: duplicated.foo

.. autoclass:: Foo
   :show-inheritance:
saimn commented 2 years ago

Hi @cblegare, Thanks for the helpful answer!

The example you provided, exposes two "nice API module having the same definitions in deeper modules"

Ok so this explains that. It may not be a widespread usage but I think there are good reasons to do this. Typically we have quite crowded top-level namespaces (astropy.coordinates for example, https://docs.astropy.org/en/latest/coordinates/#reference-api) and we want document a subset of the functions/classes in a specific doc section, e.g. https://docs.astropy.org/en/latest/coordinates/#built-in-frame-classes Another example is astropy.units (https://docs.astropy.org/en/latest/units/#module-astropy.units) for which various sub-packages are documented in specific sections. And interestingly for this one, it seems only the sub-sub-packages are raising the duplication warnings (astropy.units.function.logarithmic for example, https://docs.astropy.org/en/latest/units/#module-astropy.units.function.logarithmic), whereas the sub-packages (astropy.units.function, astropy.units.quantity etc.) are not causing warnings.

cblegare commented 2 years ago

Well, that is indeed a very rich API!

And interestingly for this one, it seems only the sub-sub-packages are raising the duplication warnings

This is because when you define (again!) you API in intermediate namespaces, you give your objects new names. But their canonical name (that is the real fully qualified importable path where the object is actually defined in the codebase) is the same as the top-level crowded namespace. The canonical names are colliding (more on this in the Remark below).

I see four concerns here

  1. What you want your user to use in their import statements for convenience
  2. Where the objects are actually defined in astropy codebase
  3. How you want to structure your API documentation, with some areas of the API having their own dedicated sections
  4. How to help the reader navigate the documentation with a some more holistic views.

In the list above, I didn't add "enable the user to import a given object from the convenient crowded namespace, the actual deep definition of the object and a few more places in-between". I don't think you really want that. But by the way you are using sphinx_automodapi, you are indeed exposing your API at multiple levels in addition to the convenient one and the "real" canonical one. This might be induced by the fact that the way you use sphinx_automodapi mixes together two side-effects: give an holistic view and expose the API in a convenient but crowded namespace. Sometimes you want to have a more local, or narrow, but still holistic, view of a given region of the API (say... units). That does not mean you want to also expose these APIs in yet another namespace. The screenshot below from you general index shows that LogUnit is advertised as exposed in three (3) different namespaces.

image

💡 Remark: Using the index to inspect the state of the API is a good cue: cross-references to a given object (say LogUnit) needs to be resolved to one and only one target unless warning are emitted. This is totally normal and expected: Sphinx can only generate an hyperlink to one destination. This means that a given object in a documentation should have one and only one possible cross-reference target. This is true for domain object (such as Python classes) but also for any documented Sphinx object. How this work with canonical names you ask? When I x-ref :class:`astropy.unit.function.logarithmic.LogUnit`, Sphinx knows this is a canonical name to astropy.unit.LogUnit and will generate a target to astropy.unit.LogUnit. If two objects have the same canonical name, which one should Sphinx choose to generate the target?

While Sphinx supports having both a "real" name (canonical, deep where the definition is in the code) and one alias (usually higher in a crowded namespace), it does not support (yet?) having multiple aliases. While I completely understand your hunger to have a fine control on how to structure documentation with varied levels of granularity, I am not sure that advertising the same API in more than two (2) namespaces (the canonical and the convenient one) is desirable.

Given the example above, I feel you want to (and Sphinx supports to)

To enable this, I suggest to pick one from three solutions

  1. Change the way you use sphinx_automodapi. The issue you have is exactly the same that made me stop using apidoc: the need to narrow some API documentation within a consistent context. Nowadays I write a bit more Sphinx directives manually (less automatically), but it pays off in the end.
  2. Enable sphinx_automodapi to add the :noindex: option sometimes. This way, you can still use the extension but the newly documented objects are not added to the index, thus not exactly cross-referenceable and would not trigger any warning.
  3. Find a way to change the Python domain to make this work and submit a PR But this would be a backward incompatible change, if even possible to implement without breaking badly many documentation out there.

    On this matter, I retract my comment above that it seems it would be doable: I now think it barely look feasible with important drawbacks. This would probably involve to find a way to tell Sphinx that a given canonical name associated with multiple names knows about a primary or a default name to pick when resolving xrefs.

saimn commented 2 years ago

Thanks @cblegare for the really clear and detailed explanation, now things make much more sense :). We will discuss that with the other Astropy maintainers and see what option we can choose.