At-ref proposal with InterSphinx compatibility

goerz commented 11 months ago

I would propose that we add support for external @ref references in a way that is fully compatible with InterSphinx (part of the Sphinx framework that is the Python-equivalent of Documenter). Sphinx is used by virtually all Python projects (including Python itself), as well as some C/C++/Fortran projects.

This provides a solution for #688 and #425.

It may also address #319 and #1343.

The proposal has two parts:

1. Every call of makedocs creates an objects.inv file in the build folder that contains a mapping of all symbols in the project to a relative URL. This "inventory" file gets deployed alongside the rest of the build folder. Thus, it will be downloadable as, e.g., https://documenter.juliadocs.org/stable/objects.inv, https://documenter.juliadocs.org/v1.1/objects.inv, etc. See below for the format of the objects.inv file.
2. A plugin, DocumenterInterlinks (or some other name), provides the ability for a project to link to any other documentation. For example, DocumenterCitations might want to link to the main Julia docs and the Documenter docs. It would set that up as follows in docs/make.jl:

# make.jl file of DocumenterCitations
# ...
using DocumenterInterlinks

links = DocumenterInterlinks(
    "julia" => "https://docs.julialang.org/en/v1.9/",
    "Documenter" => "https://documenter.juliadocs.org/stable/",
    "matplotlib" => "https://matplotlib.org/stable",  # just for extra fun (see below)
)

makedocs(plugins=[links, ], …)

Assuming that both the Julia documentation and the Documenter documentation were built such that https://docs.julialang.org/en/v1.9/objects.inv and https://documenter.juliadocs.org/stable/objects.inv exist, this would make it possible to write Markdown like the following in the DocumenterCitations documentation:

... sorting is done via the [`sort`](@ref Base.sort) routine ...

Pass the `plugins` to [`Documenter.makedocs`](@ref).

When building DocumenterCitations' documentation, these two references would ordinarily not be resolvable. With the DocumenterInterlinks plugin, for any reference that cannot be resolved locally, the plugin would go through all the external projects ("julia", "Documenter") that were set up above, download each project's objects.inv file, and resolve the reference to the link defined there. See below for details.
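The fallback order described above can be sketched in a few lines of Julia. This is only an illustration of the lookup logic, not an implementation of the plugin: the inventories are hard-coded dictionaries rather than downloaded objects.inv files, the names `ExternalProject` and `resolve_reference` are hypothetical, and the inventory entries are illustrative.

```julia
# Hypothetical stand-in for one entry of the plugin configuration:
# a label, the root URL, and an already-loaded inventory (name => relative URI).
struct ExternalProject
    label::String
    root_url::String
    inventory::Dict{String,String}
end

# Return the URL for `name`, or `nothing` if nobody knows it.
# Local targets take precedence; external projects are tried in the order given,
# so the first project that knows the name wins.
function resolve_reference(name, local_targets, projects)
    haskey(local_targets, name) && return local_targets[name]
    for project in projects
        uri = get(project.inventory, name, nothing)
        uri === nothing || return project.root_url * uri
    end
    return nothing
end

# Illustrative inventories (assumed entries, not downloaded data).
projects = [
    ExternalProject("julia", "https://docs.julialang.org/en/v1.9/",
        Dict("Base.sort" => "base/sort/#Base.sort")),
    ExternalProject("Documenter", "https://documenter.juliadocs.org/stable/",
        Dict("Documenter.makedocs" => "lib/public/#Documenter.makedocs")),
]
local_targets = Dict("local-section" => "#local-section")

url = resolve_reference("Documenter.makedocs", local_targets, projects)
```

Because the projects are tried in order, two projects that define the same name would always resolve to the one listed first.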

In principle, all of the above would work independent of the format of the objects.inv file (or inventory.toml, or any other name). I would propose though to adopt the exact format defined by InterSphinx, as detailed below. First, the format is fundamentally suitable (not surprisingly, since Sphinx and Documenter are quite similar). Second, it would allow interoperability with Sphinx: Julia projects could link to Python projects (including the Python manual). For example, the PythonPlot.jl documentation could link directly to the matplotlib documentation. I've also used Sphinx for a large Fortran project of which QuantumControl.jl is a continuation, and I would certainly be interested in being able to link from the QuantumControl.jl documentation to the Fortran QDYN documentation. The other way around would work too, of course: Python projects (or anything using Sphinx) could link directly to Julia documentation.

All of this is a bit gratuitous for the original problem #688 of just linking between Julia projects, but since we have to design some kind of inventory file anyway, simply adopting the InterSphinx format would give us a lot of interoperability basically "for free".

The links set up in the configuration may point either to something like a stable target or to a specific version. The former runs the risk of links breaking when a new version of the target package is released, but it's up to the user what they want. In any case, this is strictly better than the current solution of explicit URLs:

Pass the `plugins` to [`Documenter.makedocs`](https://documenter.juliadocs.org/stable/lib/public/#Documenter.makedocs).

Now, if Documenter 2.0 came out without a makedocs function, I'd no longer have to manually fix all the links. Instead, I could just change the stable to v1 in a single location in my make.jl file.

Lastly, I would stress that creating and deploying the objects.inv should be part of Documenter itself, not of the DocumenterInterlinks plugin: We'd want to be able to link to any Documenter-based documentation, without the authors having to opt-in via a plugin. An objects.inv file (in any format) would also solve other problems. For example, there has been some discussion (@pfitzseb) about VS Code indexing the docstrings of all installed packages. If these docstrings link to other parts of the documentation, an objects.inv could be used to resolve these links without VS Code needing to ingest every package's full documentation.

Creating an Inventory File

Documenter already keeps track of all reference targets (docstrings, headers) in a project. It would simply have to dump these into an objects.inv file in a format compatible with InterSphinx. This format has been reverse-engineered here: https://sphobjinv.readthedocs.io/en/stable/syntax.html

Basically, the objects.inv is a compressed list of plain text lines with the structure (parsed by a regex)

{name} {domain}:{role} {priority} {uri} {dispname}
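As a rough illustration of how such a line could be taken apart, here is a minimal Julia sketch. The regular expression is my own reading of the field layout above, not the one Sphinx uses, and `parse_inventory_line` is a hypothetical name. The two shorthands it expands (a trailing `$` in the URI standing for the name, and a dispname of `-` meaning "same as name") are the ones described in the sphobjinv syntax documentation.

```julia
# Assumed pattern for one decompressed data line; modeled on the field layout
# above, not copied from Sphinx's source.
const INVENTORY_LINE = r"^(.+?)\s+([^\s:]+):(\S+)\s+(-?\d+)\s+(\S*)\s+(.*)$"

# Parse one line into a named tuple, or return `nothing` if it does not match.
function parse_inventory_line(line)
    m = match(INVENTORY_LINE, line)
    m === nothing && return nothing
    name = String(m[1])
    uri = String(m[5])
    dispname = String(m[6])
    # Expand the shorthands: trailing `$` in the URI, and `-` as dispname.
    endswith(uri, '$') && (uri = chop(uri) * name)
    dispname == "-" && (dispname = name)
    return (name=name, domain=String(m[2]), role=String(m[3]),
            priority=parse(Int, m[4]), uri=uri, dispname=dispname)
end

entry = parse_inventory_line("Documenter.makedocs jl:func 1 lib/public/#\$ -")
```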

Documenter would set the individual fields as follows:

* name: For a docstring, this would be the fully qualified name, e.g. Documenter.makedocs. For headers, it should be the header ID, as described in "Duplicate Headers", or an auto-generated ID (the "slug") if there is no specified ID.
* domain would always be jl. This matches the choice in the Sphinx-Julia domain. While that project seems to be inactive, their choices are sensible, and we should adopt them.
* role should be one of type, abstract, mod, func for docstrings, depending on what kind of object the docstring is for. Again, this is taken from the Sphinx-Julia domain. This isn't really relevant for linking between Julia projects, since Documenter doesn't differentiate between
  ```
  [`makedocs`](@ref Documenter.makedocs) (link to a function)
  [`Documenter.Document`](@ref) (link to a type)
  ```
  but for Sphinx projects linking to a Documenter project, this would come into play, and the relevant reStructuredText syntax would be
  ```
  :jl:func:`makedocs <Documenter.makedocs>` (link to a function)
  :jl:type:`Documenter.Document` (link to a type)
  ```
  For headers, role would always be ref.
* priority would be 1 (not relevant to Documenter/InterSphinx).
* uri would be the URL to access the resource, relative to the objects.inv file. For example, for name=Documenter.makedocs in Documenter's objects.inv it would be lib/public/#Documenter.makedocs.
* dispname would be the same as name for docstrings, and the full string of the section title for sections.
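To make the field assignments concrete, the following Julia sketch formats two inventory lines according to the rules above. It is only an illustration: `inventory_line` is a hypothetical helper, the header entry (ID, page, and title) is made up, and a real objects.inv additionally has a short plain-text header and zlib-compresses these lines, which is omitted here.

```julia
# Format one plain-text inventory line from the six fields described above.
inventory_line(name, domain, role, priority, uri, dispname) =
    "$name $domain:$role $priority $uri $dispname"

entries = [
    # A docstring: fully qualified name, role `func`, dispname equal to the name.
    (name="Documenter.makedocs", role="func",
     uri="lib/public/#Documenter.makedocs", dispname="Documenter.makedocs"),
    # Hypothetical header entry; the ID and page are made up for illustration.
    (name="Installation", role="ref",
     uri="man/guide/#Installation", dispname="Installation"),
]

# Domain is always `jl` and priority always 1, as proposed above.
lines = [inventory_line(e.name, "jl", e.role, 1, e.uri, e.dispname) for e in entries]
body = join(lines, "\n")
```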

The `DocumenterInterlinks` extension

The extension would be opt-in, enabling a project to link out to any other Documenter or Sphinx project. At a minimum, this would handle all references that cannot be resolved locally, and go through all the external resources defined when instantiating the plugin to find one that can resolve the reference.

If we want to go further, we could also extend Documenter's @ref syntax to enable some features comparable to those of InterSphinx; see the external role in Sphinx.

All of the following would be valid links in Documenter:

* [:external:`makedocs`](@ref)  (don't try to find `makedocs` locally)
* [:external+julia:`sort`](@ref)  (link specifically to the `sort` in the Julia
  manual, cf. the label "julia" when instantiating `DocumenterInterlinks`)
* [:py:func:`matplotlib.pyplot.subplots`](@ref) (link to a function in the
  Python `matplotlib` project's Sphinx documentation)
* [`subplots`](@ref :external+matplotlib:py:func:`matplotlib.pyplot.subplots`)

I'd consider these features optional, although without them, there may be a risk of "collisions" when different packages have references for the same name. The first one found wins, but that may make it impossible to generate certain links. Nonetheless, I would start any implementation of a DocumenterInterlinks plugin by just handling otherwise unresolved references and add fancy new features only later.

I would also be open to the possibility of all external links having to use @extref instead of @ref. That is, the DocumenterInterlinks plugin could provide an entirely new command instead of overloading the existing @ref.

goerz commented 10 months ago

The prototype for this proposal is starting to shape up nicely at https://github.com/JuliaDocs/DocInventories.jl and https://github.com/JuliaDocs/DocumenterInterLinks.jl.

I will expand this issue with further points for discussion once we get the documentation for these two new packages deployed.

goerz commented 10 months ago

The DocumenterInterLinks prototype is fully functional now. See the documentation at http://juliadocs.org/DocumenterInterLinks.jl/stable/. Loading the package in docs/make.jl makes Documenter write out an objects.inv (Sphinx) inventory file, as well as an inventory.toml.gz file (my own format, see below). It also makes it possible to instantiate an InterLinks plugin object that can load inventory files generated by Sphinx and by Documenter/DocumenterInterLinks.

A few updates on the original proposal and further points of discussion for the next community meeting.

`@extref` syntax

It turns out that it isn't really feasible for a plugin to hook itself into the process of resolving @ref links. Thus, linking to external targets happens with a new @extref syntax entirely handled by the plugin. See the Syntax description; typical examples are

[Basic Markdown](@extref Documenter)
[`Documenter.makedocs`](@extref)
[Documenter's `makedocs` function](@extref `Documenter.makedocs`)
[`Documenter.parseblock`](@extref `Documenter.parseblock-Tuple{AbstractString, Any, Any}`)
[Home page of the Documenter documentation](@extref Documenter :doc:`index`)
See the section about Documenter's [Writers](@extref Documenter)

This syntax is considerably nicer than my first proposal based on the assumption we'd be overloading @ref. I've been using it quite extensively over the last couple of weeks in the documentation of DocumenterInterLinks and DocInventories as well as some of my other packages, and found it very pleasant and intuitive to use.

It would still be nice if the plugin could also hook into the resolution of @ref. The idea would be that any @ref that cannot be resolved would be retried after replacing @ref with @extref. This would be useful if I want to include a docstring from another package in my documentation, and that docstring has @ref links to other parts of that package's documentation.

This would require redesigning how Documenter resolves @ref links and replacing the current implementation with a pipeline into which plugins could insert steps. I think this would be worthwhile, but we don't have to do it right now, and I'll open another issue for that.

Writing Inventories

Writing inventory files is currently implemented in https://github.com/JuliaDocs/DocumenterInterLinks.jl/blob/master/src/write_inventory.jl.

I very strongly feel that some variation of this code should be ported into Documenter. Inventories become infinitely more useful when every project has one (just like every Python project has an objects.inv inventory, even if they don't use the Intersphinx plugin to link to other projects).

Also, the code in write_inventory.jl (necessarily) uses very internal parts of Documenter – more internal than even a plugin should, in good conscience.

The only open question would be which inventory file format we should write. The options are:

1. An objects.inv file, which is the format that Sphinx uses.

   The advantage is the small size and that it can be read directly by Sphinx (making it easier for Python projects to link to Julia projects, with some caveats, see below).

   The disadvantage (apart from "not-invented-here") is that the format isn't very human-friendly, although I've written DocInventories for working with inventories in the REPL.
2. An inventory.toml.gz file, which is a new format I've come up with (and whose details are still up for discussion).

   The advantages are:
   - TOML is widely used in Julia, so it's a familiar format
   - Very human-friendly (after decompression, which most people will be able to handle)

   Disadvantages:
   - Bigger than objects.inv, but not by much.
   - Can't be read by Sphinx out of the box (although a small Python package could be written to fix that)
3. Write both, which is what the code in write_inventory.jl currently does. Or we could even write an uncompressed inventory.toml as a third file.

I don't have a strong preference. My first inclination was to write an objects.inv file to make everything compatible with Sphinx. Alas, Sphinx won't be able to deal with the Julia domain I've defined, requiring extra code on the Python side. At that point, one might as well implement a Python reader for inventory.toml.gz. Not writing an objects.inv file would also make the InterLinks instantiation a little bit more cumbersome, in that the nice project => root_url mapping would no longer work for referencing Python projects.

The code in write_inventory.jl is written to integrate easily into Documenter. If it were part of Documenter, the write_inventory function would probably just be called as part of the HTMLWriter.render function, not in a separate pipeline step.

The code has no dependencies (not even on any other part of DocumenterInterLinks/DocInventories) other than CodecZlib. CodecZlib will have to become a dependency of Documenter. I don't think there's a way around that: the inventory files really should be compressed (they get pulled a lot, every time documentation that uses DocumenterInterLinks is built). It seems like a very stable dependency, though, so not something I'd worry about.

bskinn commented 9 months ago

👋 sphobjinv author here, discovered this issue vanity-searching the project. 😅 Glad the objects.inv syntax writeup was helpful for you!

I'd like to point out one technical detail that might affect your implementation of lookup of an objects.inv for an external project. You wrote:

> uri would be the URL to access the resource, relative to the objects.inv file.

It's not strictly true that uri will always be relative to the objects.inv. That's how the Sphinx html builder lays out the directory structure by default, but there's nothing stopping docs publishers from moving the inventory somewhere else. Django is the primary example of this that I know of:

* The root URL for the uris is, e.g., https://docs.djangoproject.com/en/dev/
* But, the inventory there is accessed at https://docs.djangoproject.com/en/dev/_objects/

In most cases, it's safe to assume that the inventory will live at {uri root}/objects.inv, but I'd be leery of hard-coding it that way.

goerz commented 9 months ago

@bskinn Thanks for checking in, and thanks for sphobjinv! It has been exceedingly helpful!

You are absolutely correct about the relative URLs; that was badly phrased in the original https://github.com/JuliaDocs/Documenter.jl/issues/2366#issue-2017224637. In DocumenterInterLinks, the URLs are indeed relative to an explicit root URL, independent of where the inventory file is read from. The InterLinks declaration is (deliberately) almost identical to Python's intersphinx_mapping.
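The separation between the root URL and the inventory location can be sketched as follows in Julia. This is not the DocumenterInterLinks API: `InventorySource`, `inventory_location`, and `target_url` are hypothetical names chosen for illustration, and the only assumption taken from the discussion above is that `{root}/objects.inv` is a default rather than a rule.

```julia
# Hypothetical description of one external project: the root URL that the
# inventory's URIs are relative to, and optionally where the inventory lives.
struct InventorySource
    root_url::String
    inventory_url::Union{String,Nothing}
end
InventorySource(root_url) = InventorySource(root_url, nothing)

# Join a root URL and a relative path with exactly one slash in between.
join_url(root, path) = rstrip(root, '/') * "/" * lstrip(path, '/')

# Where to download the inventory from: the explicit location if given,
# otherwise the conventional `objects.inv` below the root.
inventory_location(src) =
    src.inventory_url === nothing ? join_url(src.root_url, "objects.inv") : src.inventory_url

# Links are always built from the root URL, never from the inventory location.
target_url(src, uri) = join_url(src.root_url, uri)

documenter = InventorySource("https://documenter.juliadocs.org/stable/")
django = InventorySource("https://docs.djangoproject.com/en/dev/",
                         "https://docs.djangoproject.com/en/dev/_objects/")

link = target_url(documenter, "lib/public/#Documenter.makedocs")
```

With this split, the Django case needs no special handling: the inventory is fetched from its own address, while every resolved link is still built from the root URL.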

bskinn commented 9 months ago

Terrific, sounds like you had it firmly in hand already.

Thanks for the kind words, very glad it was helpful, and thanks for all you & the team do!

mortenpi commented 9 months ago

@goerz Apologies for the delay. But I finally found the bandwidth to go through this and #2424. And, actually, I don't have much to say -- LGTM! The immediate necessary changes to Documenter (#2424) are pretty minimal and don't change any behavior.

I think I have some opinions / question marks on the link resolving side (what DocumenterInterLinks does), but the writing of the inventory side of things seems very straightforward.

My two cents on the inventory files: we should probably pick one. I'm somewhat partial to the existing InterSphinx format, simply because it's some sort of an existing "standard" (even though I like TOML better as the syntax). But maybe let's discuss the pros and cons in more detail at the community call, and make a decision there.
