mkdocstrings / crystal

📘 Crystal language doc generator for https://github.com/mkdocstrings/mkdocstrings
https://mkdocstrings.github.io/crystal
MIT License

Trim anchors #8

Open Blacksmoke16 opened 9 months ago

Blacksmoke16 commented 9 months ago

Currently, when you permalink to a specific method within a type, the anchor includes the FQN of the type itself, so you end up with really long, redundant URLs. E.g. https://athenaframework.org/Framework/Controller/ValueResolvers/RequestAttribute/#Athena::Framework::Controller::ValueResolvers::RequestAttribute#initialize.

Given the anchors are already isolated to the page itself, would it not be enough to just strip the type name from the id of the method (and related types)? E.g. something like:

id=def.abs_id | replace("%s#" % obj.abs_id, '') | replace("%s." % obj.abs_id, '')

Here I changed the template so the method is referenced as def, so that the type can still be accessed via obj.
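
For readers less familiar with Jinja filters, here is a rough Python equivalent of the two chained `replace` calls above. The function name `trim_anchor` is made up for illustration and is not part of mkdocstrings; it only shows the string manipulation being proposed.

```python
def trim_anchor(abs_id: str, parent_abs_id: str) -> str:
    """Strip the enclosing type's name from a member's absolute id.

    Hypothetical helper: mirrors the chained Jinja `replace` filters by
    dropping both the "Type#" (instance) and "Type." (class) prefixes.
    """
    return abs_id.replace(f"{parent_abs_id}#", "").replace(f"{parent_abs_id}.", "")


parent = "Athena::Framework::Controller::ValueResolvers::RequestAttribute"
print(trim_anchor(f"{parent}#initialize", parent))  # initialize
```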

oprypin commented 9 months ago

Given the anchors are already isolated to the page itself

They are not, though. It's just that every user happens to use it this way.

oprypin commented 9 months ago

Relevant ongoing work:

pawamoy commented 9 months ago

Yep, with the changes in the mentioned PR, it could be possible to change the anchor of objects while retaining the complete identifier for cross-refs.

Blacksmoke16 commented 9 months ago

Gotcha, I guess I shall keep an eye on future releases then! Thanks!

oprypin commented 9 months ago

id=def.abs_id | replace("%s#" % obj.abs_id, '') | replace("%s." % obj.abs_id, '')

Hm so this is nice and all, but that makes it impossible to distinguish class methods and instance methods.

This would be some solution to distinguish them, although still looking rather weird:

Because I think you're suggesting basically

Blacksmoke16 commented 9 months ago

The easiest fix would be to just do what the stdlib doc generator does, something like:

id=def.abs_id | replace("%s#" % obj.abs_id, 'instance-') | replace("%s." % obj.abs_id, 'class-')

Then you'd end up with Foo/#instance-bar and Foo/#class-bar which is probably good enough?

EDIT: Or add it at the end 🤷. Either or.
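
A small Python sketch of why the plain trim collides and how the prefixed variant avoids it. Both function names are hypothetical; they only restate the two Jinja expressions from this thread.

```python
def trim_anchor(abs_id: str, parent_abs_id: str) -> str:
    # Plain trim: "Foo#bar" and "Foo.bar" both lose their separator.
    return abs_id.replace(f"{parent_abs_id}#", "").replace(f"{parent_abs_id}.", "")


def prefixed_anchor(abs_id: str, parent_abs_id: str) -> str:
    # Prefixed trim: keep the instance/class distinction in the anchor.
    return abs_id.replace(f"{parent_abs_id}#", "instance-").replace(
        f"{parent_abs_id}.", "class-"
    )


# The plain trim maps both kinds of method to the same anchor...
print(trim_anchor("Foo#bar", "Foo"), trim_anchor("Foo.bar", "Foo"))  # bar bar
# ...while the prefixed version keeps them apart.
print(prefixed_anchor("Foo#bar", "Foo"), prefixed_anchor("Foo.bar", "Foo"))  # instance-bar class-bar
```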

oprypin commented 9 months ago

@pawamoy Oof this one may be harder than predicted

If I have a heading with this id: Foo::Bar--usage-notes

Then I want this to remain the global identifier, but the actual local id to become: usage-notes

So I need to suppress IdPrependingTreeprocessor but still somehow not suppress it for the purpose of making this the global identifier for autorefs.

pawamoy commented 9 months ago

I'm not sure I understand. IIUC, this is what we currently have:

Docstring summary.

## Usage notes

[Link to usage notes](#usage-notes).

<p>Docstring summary.</p>

<h4 id="Foo::Bar--usage-notes">Usage notes</h4>

<p><a href="#Foo::Bar--usage-notes">Link to usage notes</a></p>

oprypin commented 9 months ago

Right but the request is to have it become

<h4 id="usage-notes">Usage notes</h4>

but the need to have a globally linkable identifier still remains

pawamoy commented 9 months ago

I'm not sure a subheading (a heading in a docstring) can ever retain just its slug as id (usage-notes), because the shortest id we can give to its parent (the object using this docstring, Foo::Bar) is the last component of the object FQN (Bar), so the shortest id for a subheading would at least be prefixed with that name (Bar--usage-notes).

Unless we're documenting Foo::Bar on a page of its own? :thinking: Then there's no heading for the object itself, and its docstring headings stay the same. Hmmm.

oprypin commented 9 months ago

The parent's id is not so important, sure it needs to be at least Bar. But all its children don't need to contain even that.

The point that this issue is trying to make is that this is redundant:

/Foo/Bar/#Foo::Bar--usage-notes (subheading)

/Foo/Bar/#Foo::Bar.some_method (child)

oprypin commented 9 months ago

Yes the point is about the usage where the item is documented on its own on the page. As much as we say that it's not necessarily the case, I'm guessing 95%+ of usages are like that

pawamoy commented 9 months ago

Yeah, right. Then we must add aliases with FQN above headings, so that autorefs registers those and redirects to the headings with short ids. So I think the two mkdocstrings filters do_heading and do_convert_markdown must take an additional argument for the short name (in addition to the FQN). And this short name can probably be inferred from the current page's path (or somehow computed given some additional user configuration).

oprypin commented 9 months ago

There's also the really nasty case

/Foo/Bar/#Foo::Bar--usage-notes wants to become #usage-notes but /Foo/Bar/#Foo::Bar.some_method--usage-notes probably wants to become #some_method--usage-notes 😵‍💫 while still retaining the differently prefixed global identifier

oprypin commented 9 months ago

Alternatively we could say that people who are so certain about the page paths should just forfeit the autorefs syntax and use normal links (absolute links will work properly in the next mkdocs release)

pawamoy commented 9 months ago

Hmmm, I think it's feasible. If we pass both the FQN and the current path (left part of the FQN down to where we are in terms of page tree): FQN=Foo::Bar, CurrentQN=Foo::Bar, then we can compute the short names relative to the current qualified name, i.e. Foo::Bar.some_method--usage-notes gives some_method--usage-notes. It seems to become dependent on the language though.

<p>Docstring summary.</p>

<!-- Foo::Bar - Foo::Bar = "" -->
<a id="Foo::Bar--usage-notes"></a>
<h4 id="usage-notes">Usage notes</h4>

<p><a href="#usage-notes">Link to usage notes</a></p>

<!-- Foo::Bar.some_method - Foo::Bar = some_method -->
<a id="Foo::Bar.some_method"></a>
<h4 id="some_method">some method</h4>

<p>Some method summary.</p>

<a id="Foo::Bar.some_method--usage-notes"></a>
<h4 id="some_method--usage-notes">Usage notes</h4>

<p><a href="#some_method--usage-notes">Link to usage notes</a></p>
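
The subtraction in the HTML comments above could be sketched in Python roughly as follows. This is only an illustration under assumptions: `SEPARATORS`, `short_id` and `heading_html` are hypothetical names, and the separator list is a guess at what Crystal identifiers plus the `--` subheading prefix would need, which is exactly the language-dependent part.

```python
from html import escape

# Assumption: separators that may follow the current qualified name.
SEPARATORS = ("--", "::", "#", ".")


def short_id(fqn: str, current_qn: str) -> str:
    """Return `fqn` relative to `current_qn`, or `fqn` unchanged if unrelated."""
    if fqn.startswith(current_qn):
        rest = fqn[len(current_qn):]
        for sep in SEPARATORS:
            if rest.startswith(sep):
                return rest[len(sep):]
    # Covers fqn == current_qn and lookalikes such as "Foo::Barn".
    return fqn


def heading_html(fqn: str, current_qn: str, title: str, level: int = 4) -> str:
    """Emit a heading with the short id, preceded by an alias anchor with the FQN."""
    short = short_id(fqn, current_qn)
    alias = f'<a id="{escape(fqn)}"></a>\n' if short != fqn else ""
    return f'{alias}<h{level} id="{escape(short)}">{escape(title)}</h{level}>'


print(heading_html("Foo::Bar.some_method--usage-notes", "Foo::Bar", "Usage notes"))
```

The alias anchor keeps the full identifier available for cross-references, while the heading itself carries the short id used in the page URL.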

oprypin commented 9 months ago

Aha. It's definitely doable somehow and I was almost able to finish coding all this, but it just becomes super complicated

pawamoy commented 9 months ago

That's what I meant by "a bit more work" in the autorefs PR hahah

pawamoy commented 9 months ago

I don't think it's that complicated. Computing names relative to other names seems however to depend on the language (different separators), so this would need some kind of hook between mkdocstrings and the handlers... and yeah that part complicates things.

Blacksmoke16 commented 9 months ago

Alternatively we could say that people who are so certain about the page paths should just forfeit the autorefs syntax and use normal links

I will say I'd be okay with this. For my specific use case I think I'd have to do this anyway to link something from one component to another, unless something is done to integrate https://squidfunk.github.io/mkdocs-material/plugins/projects/.

pawamoy commented 9 months ago

this would need some kind of hook between mkdocstrings and the handlers

Actually, if handlers can expose path components on objects (as a string tuple), there's no need for a hook:

If we can infer we're down to ("Foo", "Bar") (for example because current page ends with Foo/Bar), then we can compute the relative path of Foo::Bar.some_method to be some_method.

This would be language-agnostic.
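
A minimal sketch of that idea, assuming each object exposes its path as a tuple of strings. The function name `relative_path` is hypothetical, and how the page's own components are inferred is left out here.

```python
def relative_path(obj_path: tuple[str, ...], page_path: tuple[str, ...]) -> tuple[str, ...]:
    """Drop the leading components that the current page already implies.

    Works on plain tuples, so no language-specific separators are involved.
    """
    if obj_path[: len(page_path)] == page_path:
        return obj_path[len(page_path):]
    # Object is not under this page: keep its full path.
    return obj_path


# Foo::Bar.some_method, seen from the page documenting Foo::Bar.
print(relative_path(("Foo", "Bar", "some_method"), ("Foo", "Bar")))  # ('some_method',)
```

Joining the remaining components back into an anchor string would still be up to the handler, since only it knows the separators.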

oprypin commented 9 months ago

I think the main difficulty is still here

https://github.com/mkdocstrings/autorefs/blob/95d32b492bf038fff6974addc2d572582a37dadd/src/mkdocs_autorefs/plugin.py#L187

We are still relying on "autorefs" to scan the headings, and it is hardcoded to look at id of headings, which can't easily be changed because this info is obtained from toc_tokens.

pawamoy commented 9 months ago

Argh, my brain is exploding.

IIUC, headings in docstrings are solved (see previous comments, and thanks to the fact that the autorefs extension also runs when converting docstrings from markdown to html). The remaining issue is for headings generated by mkdocstrings itself with its do_heading Jinja filter, because we pass these headings through toc to retrieve them in on_page_content, calling map_urls on them. If we were to add anchors above headings in our Jinja templates, they would simply be ignored because the generated HTML is stashed and autorefs won't operate on it.

So, one solution would be to somehow call autorefs' register methods directly from our do_heading filter? :thinking:

pawamoy commented 6 months ago

We are still relying on "autorefs" to scan the headings, and it is hardcoded to look at id of headings, which can't easily be changed because this info is obtained from toc_tokens.

After taking a good, serious look at everything again, it turns out that the on_page_content hook of the autorefs plugin, where toc items are iterated on to register anchors (same link as above), is only meant for headings written directly in Markdown pages, and not for headings generated by the do_heading filter, nor for subheadings appearing in docstrings.

These latter headings (API object headings and their subheadings in docstrings) are registered directly from within our mkdocstrings outer extension, in AutoDocProcessor.run. We don't explicitly pass them through toc, so it should be easy to add the necessary info to make this work. For subheadings in docstrings, they do pass through toc to gain an id, then our id prepending processor prepends the current object id to these ids. The prefix is the HTML id we pass to convert_markdown.

Note that autorefs in full-mode (listed in mkdocs.yml plugins) will see these headings (object headings and their docstring subheadings) and register them again. This plays a role in what follows: if we implement custom (trimmed) anchors for object headings and their subheadings by changing their HTML id and providing "aliases", autorefs in full-mode will register the aliases too, which we don't want.

Three examples.

Example 1, with root object heading:

This will cause issues if there are other non-API-object baz headings throughout the site.

Example 2, with root object sub-heading:

This will cause issues too if there are other non-API-object qux headings throughout the site. This one is more problematic because text headings are more likely to be reused.

Example 3, with sub-object sub-heading:

This one is less problematic because users are not supposed to try and cross-reference something using the short baz--qux version, and equivalent plain text headings # baz--qux would have their id slugified as baz-qux (single dash) anyway. In this case, autorefs would simply register anchors that are never used or usable (simply put, harmless noise in the URL map).
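
To illustrate the single-dash claim, here is a stdlib-only approximation of a Markdown-style heading slugifier. It is written from memory to resemble Python-Markdown's default `toc` slugify and is an assumption, not the library's actual code; the exact behavior depends on the slugify function configured for the site.

```python
import re
import unicodedata


def slugify(value: str, separator: str = "-") -> str:
    """Approximate slugifier: ASCII-fold, drop punctuation, collapse separators."""
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Runs of separators and whitespace collapse into a single separator.
    return re.sub(rf"[{re.escape(separator)}\s]+", separator, value)


print(slugify("baz--qux"))  # baz-qux
```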

Conclusion: we have to find a way to prevent autorefs from registering short anchors again.


For the implementation itself of trimmed anchors: