facebook / docusaurus

Easy to maintain open source documentation websites.
https://docusaurus.io
MIT License

Access to docs metadata from individual doc pages #6302

Closed Danielku15 closed 2 years ago

Danielku15 commented 2 years ago

Have you read the Contributing Guidelines on issues?

Description

On many occasions, docs authors want to generate dynamic pages based on the metadata of other pages. Imagine that you describe one property/method of your library per file in a structure like /docs/reference/{propname}.mdx

In each MDX file you use the metadata to describe some basics of your property (type, default value, etc.). From this folder you want to be able to generate an overview table listing all properties of your library. Take https://datatables.net/reference/option/ as a reference for that.

Currently Docusaurus does not provide a way to query all docs pages from the /docs/reference folder and access their metadata for dynamically generating content on a /docs/reference/index.mdx page.

I am opening this feature proposal because @slorber suggested discussing this feature here on the issues or on Discord. So I tried to make a concrete proposal.

Has this been requested on Canny?

https://docusaurus.io/feature-requests/p/access-to-docs-metadata-from-individual-doc-pages

Motivation

With access to the site metadata from any page (including listing the available pages and then loading any of their metadata) you can build a wide range of advanced components that enrich the documentation experience for users while keeping the maintenance effort low. This includes:

API design

The access to the site metadata should be exposed by a special module which can be imported to any MDX or JS file. This module exports various APIs to get access to the pages and their related metadata.

The data model should be similar to the internals of the docs plugin: https://github.com/facebook/docusaurus/blob/main/packages/docusaurus-plugin-content-docs/src/types.ts

When it comes to the API, it is a matter of how much convenience we want to offer devs. The most basic variant would be a single export const loadedContent: LoadedContent; which simply contains all the information. Developers would need to scan this tree themselves to find the pages they are interested in.

The next level of convenience is then to provide some filtered versions of loadedContent like:

The good thing is: the docs plugin should have the majority of the data already available, it is used in various ways within the plugin. It is "just" a matter of making it available for the documentation modules.
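As a rough illustration of what one such filtered helper could look like, here is a minimal sketch. Everything in it is an assumption made for the example: the `loadedContent` object is a simplified stand-in loosely modeled on the docs plugin types (only `versionName`, `docs`, `id`, `permalink` and `frontMatter` are used), and `getDocsUnderPath` is a hypothetical name, not an existing Docusaurus API.

```javascript
// Simplified, hypothetical stand-in for the docs plugin's loaded content.
const loadedContent = {
  loadedVersions: [
    {
      versionName: 'current',
      docs: [
        {id: 'intro', permalink: '/docs/intro', frontMatter: {}},
        {
          id: 'reference/width',
          permalink: '/docs/reference/width',
          frontMatter: {type: 'number', defaultValue: '100'},
        },
        {
          id: 'reference/title',
          permalink: '/docs/reference/title',
          frontMatter: {type: 'string', defaultValue: ''},
        },
      ],
    },
  ],
};

// Hypothetical helper: return all docs of one version whose permalink
// lives below the given base path.
function getDocsUnderPath(content, basePath, versionName = 'current') {
  const version = content.loadedVersions.find(
    (v) => v.versionName === versionName,
  );
  if (!version) {
    return [];
  }
  // Normalize the prefix so that "/docs/ref" does not match "/docs/reference".
  const prefix = basePath.endsWith('/') ? basePath : `${basePath}/`;
  return version.docs.filter((doc) => doc.permalink.startsWith(prefix));
}

const referenceDocs = getDocsUnderPath(loadedContent, '/docs/reference');
console.log(referenceDocs.map((doc) => doc.id));
```

An overview table component could then iterate over the returned docs and read each entry's `frontMatter`.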

Also note the example implementation below, it might give some insights into the usage of such a system.

Have you tried building it?

Yes. A long time ago, when I migrated my website to Docusaurus, I built a very basic version that covers my needs. It does not handle everything well (incremental compilation, for example), but it works. The changes were added on this fork/branch: https://github.com/Danielku15/docusaurus/tree/feature/docs-with-pageapi

This PR shows the changes (respective to the old branch) https://github.com/Danielku15/docusaurus/pull/1/files

The page metadata is made available to the modules as a virtual module during compilation. To be in line with the practices of Docusaurus, the whole docs metadata is flushed to disk as JSON. This happens here:

https://github.com/Danielku15/docusaurus/pull/1/files#diff-feafd15ba7ba518ede1488b414118b26c572ff1b9f3cd5625406980490d95652R348

Then, during the webpack compilation, I register a virtual module with the VirtualModulesPlugin, which makes the raw data from the JSON available through a @docusaurus-meta/docs module. This happens here:

https://github.com/Danielku15/docusaurus/pull/1/files#diff-feafd15ba7ba518ede1488b414118b26c572ff1b9f3cd5625406980490d95652R418

Additionally there are some adaptations here to the types and here to some data mappings. This is needed to allow developers to specify any custom site metadata and not only the built-in ones.

  1. I create many pages in a tree which have metadata defined like this.
  2. I created a table component which loads all pages with a given base path. The pages with their metadata are loaded through some custom helper on top of the virtual module.
  3. The table is then used like this.
  4. And visually the result looks like this.

For the real implementation, the virtual module handling likely needs to be improved to handle incremental compilation better. I am not sure how Docusaurus handles the compilation pipeline in the latest version, but we should try to rely on files which are already generated. Maybe webpack has some good built-in features to provide the JSON files from a whole directory to modules.

Side note: The PR/fork contains a second feature which allows me to virtually put the subpages into a sidebar without the pages being explicitly added to one. On each page I can put a redirectSidebarToRoute: /docs/other/page metadata entry, and on this page the sidebar is rendered as if I were on /docs/other/page right now. Without this feature, when the dynamic subpages are opened, the sidebar does not render properly. This can be seen when opening this page, the sidebar looks like I am on this page.

Regarding the question below: Yes I would be willing to contribute this feature, but I guess I might need some guidance on the current practices within Docusaurus (especially regarding the compilation pipeline and testing practices). Also I don't think I can contribute this directly within 7 days so I leave it unchecked for now.

Self-service

slorber commented 2 years ago

Thanks for the detailed writeup @Danielku15

Unfortunately, it is a big wall of text with many links. It is quite hard and time-consuming for me to process all this, and I'm still unsure what exactly you are trying to build here, as I can only understand one of the 3 provided examples.

I need first to have a perfect understanding of what you want to build exactly.

Instead of you providing a technical solution to this problem, I'd rather shift the conversation to something else: can you focus on describing the problem you are trying to solve in the first place?

What data do you want to show where exactly?

Please include as many screenshots as possible (or use https://excalidraw.com/) and provide live site URLs when relevant.

Please be very specific to your use case. Do not try to generalize the problem and interpolate this to the possible needs of other users, at least not for now.

If you have multiple use-cases and they are complex and/or slightly unrelated, let's move to GH discussions as it handles multiple separate but related conversations better, see https://github.com/facebook/docusaurus/discussions/5468


Possibly related, but can't be sure for now:

Danielku15 commented 2 years ago

@slorber Sure, I'll try to rephrase my request more simply and be less verbose to start with. The issue template tends to scare people off a bit with all the remarks that issues might simply get closed if you don't put enough effort into describing your case and situation. 😅 I hope this round makes it easier for you to understand my needs 😁

What I need: Within any mdx or js file, I want to be able to programmatically scan for all the docs pages in my website and use the metadata of these pages.

What needs extension for this: I need at least one JavaScript function allowing me to access the available docs pages and at least a second one to obtain the docs metadata for a given docs page, including custom information for my needs.

What I want to build with this: I am creating a lot of individual pages related to different aspects of the library I am developing (properties, data models, available functions,..). These pages are enriched with metadata to describe details like:

Using the pages and their metadata I then want to build overview tables listing all properties and methods with the respective metadata from the pages, like a technical table of contents. I also want to build a component which annotates cross references in my documentation with additional information.

Visual examples with annotations: (four images, not reproduced here)

cassieevans commented 2 years ago

I'm also after exactly this!

What I need: Within any mdx or js file, I want to be able to programmatically scan for all the docs pages in my website and use the metadata of these pages.

Simplified use case:

Currently I am creating pages for each property and method available, these pages have front matter containing more information about them - e.g. their parameters and a description

It would be ideal if in the parent's 'index' page I could loop through these subpages, access the front matter and create a table of all the properties and methods.


11ty have global data files (JSON) and directory specific data files which are incredibly handy for creating data that's accessible in many different md files - maybe there's something like this in docusaurus that I've missed?

https://www.11ty.dev/docs/data-template-dir/

slorber commented 2 years ago

@cassieevans I don't know much about Eleventy but I think we have solutions that may be quite similar.


First of all, Docusaurus is modular (plugins system), and we have 2 potential cases here:

I'd like to know for both of you if we are in the 1st or 2nd case (or both).


First case: docs data -> docs page

We have a concept of "category generated index".

It looks like cards by default but you can swizzle and replace the rendering of items with whatever you want.

https://docusaurus.io/docs/category/guides

If you write a category index page in markdown, you can embed this "list" inside that index page.

https://docusaurus.io/docs/sidebar/items#embedding-generated-index-in-doc-page

import DocCardList from '@theme/DocCardList';
import {useCurrentSidebarCategory} from '@docusaurus/theme-common';

In this section, we will introduce the following concepts:

<DocCardList items={useCurrentSidebarCategory().items}/>

The result looks similar to what you want, see in https://docusaurus.io/docs/sidebar


We only render the direct children of the category, but you are free to create your own component and call useCurrentSidebarCategory() in it if you want.

Now I don't think arbitrary frontmatter will end up being available in this React hook, but it makes sense that we add this possibility IMHO, so it's worth opening a new dedicated issue to discuss this specifically.

We have a concept of sidebarItem.customProps, so it probably makes sense that by default we allow setting it through frontmatter like sidebar_customProps: {x: 3, y: 'hello'}

Let me know what you think.


Second case: docs data -> any page

For plugins, we have a global data API

For site global data, we have config.customFields:

These data will be available from everywhere, all pages created by all plugins, including the site layout (navbar, footer etc...)

Warning: unlike the 1st solution, the data created this way is... global, so it means it will be loaded no matter which page you visit. Try to keep it small.

Now to expose docs data as global site/plugin data, you have different choices:

As explained in the other issue, we don't have yet a good API to "extend" a content plugin (https://github.com/facebook/docusaurus/issues/4138), but you can do something like this:

const docsPluginExports = require("@docusaurus/plugin-content-docs");

async function docsPluginEnhanced(...pluginArgs) {
  const docsPluginInstance = await docsPluginExports.default(...pluginArgs);

  return {
    // spread default docs lifecycles
    ...docsPluginInstance,

    // wrap/override the default contentLoaded lifecycle to provide additional logic
    contentLoaded: async function(params) {
      // execute default docs plugin behavior (ie create docs routes)
      await docsPluginInstance.contentLoaded(params);

      // Add your own extra behavior here
      params.actions.setGlobalData({allDocsFromLatestVersion: params.content.versions[0].docs});
    }
  };
}

module.exports = {
  ...docsPluginExports,
  default: docsPluginEnhanced
};

(I'll need to double-check, I think setGlobalData might erase former global data)
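Because global data is loaded on every page, it may be worth trimming each doc down to the fields an overview table actually needs before handing it to setGlobalData. The following is only a sketch under assumptions: `toSlimDocs` is a hypothetical helper, and the doc shape (`id`, `title`, `permalink`, `frontMatter`) is simplified for the example.

```javascript
// Hypothetical helper: keep a few core fields plus selected frontmatter keys,
// so the global data payload stays small.
function toSlimDocs(docs, frontMatterKeys) {
  return docs.map((doc) => {
    const slim = {id: doc.id, title: doc.title, permalink: doc.permalink};
    for (const key of frontMatterKeys) {
      if (doc.frontMatter && key in doc.frontMatter) {
        slim[key] = doc.frontMatter[key];
      }
    }
    return slim;
  });
}

// Example input with made-up values; "source" stands for any large field
// that should not end up in global data.
const exampleDocs = [
  {
    id: 'reference/width',
    title: 'width',
    permalink: '/docs/reference/width',
    source: '@site/docs/reference/width.mdx',
    frontMatter: {type: 'number', since: '1.0.0', internalNote: 'not needed'},
  },
  {
    id: 'intro',
    title: 'Introduction',
    permalink: '/docs/intro',
    source: '@site/docs/intro.mdx',
    frontMatter: {},
  },
];

const slimDocs = toSlimDocs(exampleDocs, ['type', 'since']);
console.log(JSON.stringify(slimDocs, null, 2));
```

In the wrapper above, the trimmed array could be passed as the value of allDocsFromLatestVersion instead of the full docs list.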


Note: in any case you can also have a pre-start/build script that generates JSON files, and you can import data from './data.json' in any mdx file.

Similarly to above, you can also tell the docs plugin to generate such files in a loadContent lifecycle override
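A pre-build script of that kind could look roughly like the sketch below. It is not part of Docusaurus; the function names, the flat "key: value" frontmatter reader and the output file name are all assumptions for illustration, and a real script would likely use a proper YAML parser.

```javascript
const fs = require('fs');
const path = require('path');

// Minimal frontmatter reader: handles only flat "key: value" lines between
// the leading "---" markers. Values are kept as plain strings.
function readFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  if (!match) {
    return {};
  }
  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) {
      continue;
    }
    const key = line.slice(0, separator).trim();
    if (key) {
      data[key] = line.slice(separator + 1).trim();
    }
  }
  return data;
}

// Collect the frontmatter of every .md/.mdx file in one folder,
// skipping the index page that will render the overview.
function collectDocsMetadata(dir) {
  return fs
    .readdirSync(dir)
    .filter((name) => /\.mdx?$/.test(name) && !/^index\.mdx?$/.test(name))
    .sort()
    .map((name) => ({
      file: name,
      ...readFrontMatter(fs.readFileSync(path.join(dir, name), 'utf8')),
    }));
}

// Write the collected metadata as JSON and return it.
function writeDocsMetadata(dir, outFile) {
  const metadata = collectDocsMetadata(dir);
  fs.writeFileSync(outFile, JSON.stringify(metadata, null, 2));
  return metadata;
}

// Example call (paths are placeholders):
// writeDocsMetadata('docs/reference', 'docs/reference/data.json');
```

The generated file can then be imported from the index page as described in the note above.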


Let me know if those solutions solve your problems

cassieevans commented 2 years ago

Thanks so much for this detailed response @slorber!

It's really helped clear things up for me and give me a better idea of how this all slots together.

In answer to your question - I'm specifically needing use case 1) docs data -> docs page

I only need direct children, so it looks like a category index page with a swizzled DocCardList will be ideal. However, I logged the items and can only access the following. (As you mentioned, no access to arbitrary frontmatter.)

docId: "-"
href: "-"
label: "-"
type: "-"

Without access to the frontmatter of those child pages my hands are tied a little. If it's a possibility to add that functionality that would be amazing and hugely appreciated!

Should I open a new dedicated issue?

slorber commented 2 years ago

Great.

Yes it's better to open a separate dedicated issue IMHO as it's a more concrete use-case and @Danielku15 might have different needs.

Please try to provide an example of the frontmatter you want to use (and say whether using a different frontmatter name is ok). A mockup of the UI you want to design can also help to understand the outcome you want to achieve.

Danielku15 commented 2 years ago

First case: docs data -> docs page

I will need to give this one a try but I think there might be one feature missing to reach my goal. My thinking so far:

  1. I would use an Autogenerated sidebar to bring all my subpages into the sidebar.
  2. I then would define additional custom frontmatter in sidebar_custom_props. https://github.com/facebook/docusaurus/pull/6619/ should have brought this feature.
  3. Then I use useCurrentSidebarCategory().items to access all details and generate my table.
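The table generation in step 3 can be sketched as a plain mapping function. The item shape assumed here (`type`, `label`, `href`, and a `customProps` object filled from sidebar_custom_props) follows what is described in this thread; the `type` and `since` custom props and the `itemsToRows` name are made up for the example.

```javascript
// Hypothetical helper: turn sidebar items into rows for an overview table.
// Only "link" items are kept; nested categories are ignored in this sketch.
function itemsToRows(items) {
  return items
    .filter((item) => item.type === 'link')
    .map((item) => ({
      name: item.label,
      href: item.href,
      // Custom props are optional, so fall back to empty strings.
      type: item.customProps?.type ?? '',
      since: item.customProps?.since ?? '',
    }));
}

// Example items with made-up values.
const exampleItems = [
  {
    type: 'link',
    label: 'width',
    href: '/docs/reference/settings/width',
    customProps: {type: 'number', since: '1.0.0'},
  },
  {type: 'link', label: 'title', href: '/docs/reference/settings/title'},
  {type: 'category', label: 'Advanced', items: []},
];

const rows = itemsToRows(exampleItems);
console.log(rows);
```

Inside a React component, the input would come from useCurrentSidebarCategory().items, and each row would be rendered as one table row.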

This only brings me to one problematic situation: all my subpages are suddenly in the sidebar, which might be a lot! 😨 Just take this table as an example: https://alphatab.net/docs/reference/settings I wouldn't want them all to actually be in the sidebar. Maybe https://docusaurus.io/docs/sidebar/items#collapsible-categories could solve this?

I'll let you know if this would work out.

Second case: docs data -> any page

Thanks for the hints. I would need to rethink the documentation strategy in this case, but this might not be bad. I was thinking about relying more on my TSDoc as an input to my docs anyway. But as this is more of a future idea to improve my docs, I can also live without it for now. My intermediate goal is to get away from my custom fork so I can stay up to date with all deps and other great features of Docusaurus 😁

slorber commented 2 years ago

@Danielku15

This only brings me to one problematic situation: All my subpages are suddenly in the sidebar which might be a lot! 😨 Just take this table as an example: alphatab.net/docs/reference/settings I wouldn't want them all to be actually in the sidebar. Maybe docusaurus.io/docs/sidebar/items#collapsible-categories could solve this?

Honestly, I don't really understand why these items couldn't be part of the sidebar 🤷‍♂️ you can definitely collapse the category by default.

There are many similar examples, like this one: https://supabase.com/docs/reference/dart/using-filters

IMHO it's also a good practice if you isolate the API ref documentation into a separate sidebar, as ref documentation is expected to be quite exhaustive (on purpose) compared to guides and tutorials.


Still proposing a solution that we could technically implement:

By default, the Docusaurus docs plugin will create a bundle for a given docs version. This bundle is shared between all docs, this is important so that each doc does not re-download a new JSON file on each navigation.

Currently it looks like this:

  export type PropVersionDoc = {
    id: string;
    title: string;
    description?: string;
    sidebar?: string;
  };

  export type PropVersionDocs = {
    [docId: string]: PropVersionDoc;
  };

  export type PropVersionMetadata = {
    pluginId: string;
    version: string;
    label: string;
    banner: VersionBanner | null;
    badge: boolean;
    className: string;
    isLast: boolean;
    docsSidebars: PropSidebars;
    docs: PropVersionDocs;
  };

When a doc is displayed, all those data have to be downloaded ahead of time so that React can render appropriately, so you should rather keep these data small.

We added recently sidebar_custom_props in https://github.com/facebook/docusaurus/pull/6619

Eventually we could add another frontmatter like custom_props and make it available on the PropVersionDoc type:

  export type PropVersionDoc = {
    id: string;
    title: string;
    description?: string;
    sidebar?: string;
+    customProps: Record<string,unknown>
  };

There's a hook to access this data. It's not officially documented as public API but I think it makes sense that we expose it officially someday, that doesn't seem too risky to rely on it.

import {useDocsVersion} from '@docusaurus/theme-common';

Note that you won't access "docs metadata" (because those can be too large by default to put in this data bundle) but only your custom props, so it might require you to duplicate a bit some frontmatter data.

Would that solve your use case?

Danielku15 commented 2 years ago

I got quite far lately with my realization (based on #6619) and it looks quite good already. I just didn't manage yet to share the insights and you were a bit faster than me with responding again 😁

Honestly, I don't really understand why these items couldn't be part of the sidebar 🤷‍♂️ you can definitely collapse the category by default.

This would often mean more than 100 items being put into the sidebar. As a library author, I simply don't want to bloat my sidebars like this. For some cases it might not be such a problem, but my primary use case is API and data model reference documentation. There are mixed practices established across library authors and everyone does it a bit differently: not having the subitems in the navigation, dynamically changing sidebars, expanders for all hierarchies (with different levels of auto-expansion).

I personally don't like this bloat in the sidebar. But it is mostly a personal preference as a library and docs author.

But also: I solved this challenge with a rather simple approach. I set collapsible to false and add a className: 'referenceApi' to these special menu items. Then a simple li.referenceApi ul { display: none; } hides all the subitems from the sidebar 🎉 The DOM might still be bloated, but it makes everything work and display as I like.
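For readers who want to try the same trick, a sidebar definition along these lines might look like the sketch below. The sidebar name, label and directory are placeholders, and this is a guess at the setup rather than the actual alphaTab configuration.

```javascript
// sidebars.js (sketch). "docs", "Settings" and the dirName are placeholders.
const sidebars = {
  docs: [
    {
      type: 'category',
      label: 'Settings',
      // Never collapsed, so there is no expander arrow to click.
      collapsible: false,
      // Marker class for the CSS rule that hides the children:
      //   li.referenceApi ul { display: none; }
      className: 'referenceApi',
      // All pages of the folder still end up in the sidebar data,
      // which keeps useCurrentSidebarCategory().items populated.
      items: [{type: 'autogenerated', dirName: 'reference/settings'}],
    },
  ],
};

module.exports = sidebars;
```

The CSS rule from the comment would go into the site's custom stylesheet.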

Still proposing a solution that we could technically implement:

It would be nice if we have one custom property bag which we can make the details available to both the category and other pages. If we extend PropVersionDoc it should maybe rather go for a single custom_props and make it available in useDocById/useDocsVersion and useCurrentSidebarCategory().items equally and dropping the separate sidebar_custom_props.

Another "not-so-nice" thing is that we have multiple models exposing the same data but with different object properties, making the usage a bit harder from different pages in a shared React component. In the frontmatter variable of the page itself, we would use frontmatter.title to access the title. With useCurrentSidebarCategory().items we need to go for useCurrentSidebarCategory().items[0].label. PropVersionDoc again has .title. But again: not a blocker 😉


Beside that I found some small obstacles I may report. Maybe these can be addressed in smaller improvements/features. One example:

The Next/Previous paginator on the bottom, will be based on the lowest hierarchy level. I would rather prefer if the next/previous buttons are pointing to the siblings of the category page.

But I plan to consolidate these findings first and try to find other solutions before really requesting a change 😉

slorber commented 2 years ago

It would be nice if we have one custom property bag which we can make the details available to both the category and other pages. If we extend PropVersionDoc it should maybe rather go for a single custom_props and make it available in useDocById/useDocsVersion and useCurrentSidebarCategory().items equally and dropping the separate sidebar_custom_props.

The thing is you already have a property bag with frontmatter, it's just that this bag is only exposed to the doc that declares it.

What you want here is a "shared" bag for the doc that would be shared across all the docs pages of a given version. Not so sure custom_props would be a good name then as it does not really convey that this data is "shared".

useDocById/useDocsVersion already read the shared data.

useCurrentSidebarCategory reads from the sidebar data, which is another bag. We can't really put the data you want there because a doc may not have a sidebar. We really need 2 bags. Eventually we could expose a hook that reads both bags and exposes a unified model but I'm not sure it's a good idea.

Another "not-so-nice" thing is that we have multiple models exposing the same data but with different object properties making the usage a bit harder from different pages in a shared React Component. In the frontmatter variable of the page itself, we would use frontmatter.title to access the title. With useCurrentSidebarCategory().items we need to go for the useCurrentSidebarCategory().items[0].label. PropVersionDoc also has again .title. But again: not a blocker 😉

frontMatter.title is only the title declared through frontmatter. Some docs may have an undefined value because they declare doc title using a markdown title # title so you should not really rely on this to get a title unless you consistently declare titles through frontmatter.

Yes there are multiple data models but somehow this is expected. We could use the same big fat model everywhere, but that would lead to "overfetching" (somehow the same problem solved by GraphQL): larger data bundles and worse performance for all docs pages.

When implementing an app, it's quite usual to have a different model for the list view and the detail view. You don't display all the data on the list view.

slorber commented 2 years ago

The Next/Previous paginator on the bottom, will be based on the lowest hierarchy level. I would rather prefer if the next/previous buttons are pointing to the siblings of the category page.

We have new frontmatter to control pagination so you should be able to achieve what you need?

Danielku15 commented 2 years ago

What you want here is a "shared" bag for the doc that would be shared across all the docs pages of a given version.

Depends on the right definition of "shared" in this context. The term might be a bit ambiguous: I do not want to have the "same" object/bag across all pages but the bag of each page should be shared with others for reading. 😁

e.g. /docs/api/method1 has since: 1.0.0 and /docs/api/method2 has since: 1.2.0 in their custom_props. If I show data for method1 (category or on other pages in tooltips) I want to display a [1.0.0] badge and for method2 a [1.2.0] badge.

One of my goals is to add tooltips to my docs like GitHub does for cross references (image not reproduced here). The UI would use the data coming through useDocById (title + description and, in future, custom_props).

frontMatter.title is only the title declared through frontmatter.

I think we can ignore this inconvenience in case we follow the proposed custom_props approach. Just in case you're interested in my current problem, here are some insights 😉: In my custom React component used across pages, I need to check whether to access .label or .title depending on whether the data I pass in comes from useCurrentSidebarCategory (category page) or useDocById/metadata (other pages).
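That check can be isolated in one small helper so the shared component does not need to care where its data came from. This is only a sketch; `getDisplayTitle` is a hypothetical name, and the two input shapes are reduced to the single property that differs.

```javascript
// Hypothetical helper: sidebar items expose "label", doc data exposes "title".
// Prefer whichever is present and fall back to an empty string.
function getDisplayTitle(entry) {
  return entry.label ?? entry.title ?? '';
}

// Simplified examples of the two shapes discussed above.
const fromSidebar = {type: 'link', label: 'width', href: '/docs/api/width'};
const fromDocData = {id: 'api/width', title: 'width'};

console.log(getDisplayTitle(fromSidebar) === getDisplayTitle(fromDocData));
```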

In future I will also use useDocById on the category index page to access the custom_props of the children eliminating this problem.

We have new frontmatter to control pagination so you should be able to achieve what you need?

Technically yes. But it comes at the cost that for each of my >100 pages I need to define the pagination_next and pagination_prev individually instead of having maybe just a base rule I define once (maybe in the sidebar.js?)

I was thinking of a proposal like this for the autogenerated sidebars:

type SidebarItemAutogenerated = {
  type: 'autogenerated';
  dirName: string; // Source folder to generate the sidebar slice from (relative to docs)
  pagination: 'default' | 'disabled' | 'ignore-generated'; // default: like today, disabled: no pagination for any of the items (unless overridden on page), ignore-generated: all generated items will be excluded from the pagination by default
};

This way my hundred individual methods are all skipped, and the paginator shows the next sibling of the category index page by default.
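To make the intended 'ignore-generated' behavior concrete, here is a minimal sketch of how previous/next could be computed under that rule. It is not how Docusaurus computes pagination; the flattened list, the `generated` flag and the `getPagination` name are assumptions for the example.

```javascript
// Each entry of the flattened sidebar is assumed to look like
// {id: string, generated: boolean}, where "generated" marks docs coming from
// an autogenerated slice configured with pagination: 'ignore-generated'.
function getPagination(flatDocs, currentId) {
  // Generated docs are skipped as pagination targets. The current doc is
  // kept so that its own neighbors can still be looked up.
  const candidates = flatDocs.filter(
    (doc) => !doc.generated || doc.id === currentId,
  );
  const index = candidates.findIndex((doc) => doc.id === currentId);
  if (index === -1) {
    return {previous: undefined, next: undefined};
  }
  return {
    previous: candidates[index - 1]?.id,
    next: candidates[index + 1]?.id,
  };
}

// Example sidebar order with made-up ids.
const flatDocs = [
  {id: 'intro', generated: false},
  {id: 'reference/settings', generated: false},
  {id: 'reference/settings/width', generated: true},
  {id: 'reference/settings/title', generated: true},
  {id: 'guides', generated: false},
];

console.log(getPagination(flatDocs, 'reference/settings'));
```

With this rule, a generated page itself points back to the category index page and forward to the next regular page, which is one possible reading of the proposal.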

Danielku15 commented 2 years ago

Good news, I migrated my whole docs now to the latest Docusaurus. The sidebar_custom_props combined with useCurrentSidebarCategory allowed me to generate the tables like I had them before. An example page can be seen here: https://alphatab.net/docs/reference/settings

With this, I would say the primary use case I need in order to maintain my current docs is supported. I was able to work around the things I didn't like out of the box (hiding sidebar items with CSS) or accepted them as they are because they might not really matter (e.g. pagination). If new things pop up I will report them individually. This issue already became a bit too big 😅

Therefore only my future use case remains: "GitHub-like cross-reference tooltips and inline hints using the frontmatter of a known page". For this I would love to see the proposed feature of having a custom_props which can be accessed through useDocById.

@slorber Shall we keep this issue alive for the second usecase or shall we maybe rather close this one and open a fresh one? The discussion here already evolved so much that I am worried that there is too much irrelevant information which might confuse people.

slorber commented 2 years ago

@Danielku15 yes that's probably worth closing this issue. Other readers might struggle a bit when reading this issue 😅

I opened a proposal here, please let me know what you think of it: https://github.com/facebook/docusaurus/issues/6923

wei-harness commented 1 year ago

Thanks for the solution @slorber, but when I tried to embed a category index page in a markdown page, I ran into an error that said: Unexpected: cant find current sidebar in context

I tried both:

import DocCardList from '@theme/DocCardList';

<DocCardList />

and

import DocCardList from '@theme/DocCardList';
import {useCurrentSidebarCategory} from '@docusaurus/theme-common';
<DocCardList items={useCurrentSidebarCategory().items}/>

Docusaurus version: 2.4.0

slorber commented 1 year ago

@wei-harness there is no concept of "current sidebar category" on a markdown page. A markdown page is not related to docs, you have to pass it a hardcoded list of things to display otherwise it's impossible to guess your intent.

import DocCardList from '@theme/DocCardList';

<DocCardList items={[/* hardcoded list here */]}/>

Note this usage is undocumented and considered as an impl detail, not a public API. Only the usage of <DocCardList /> without props, in a docs context (not page context) is documented: https://docusaurus.io/docs/sidebar/items#embedding-generated-index-in-doc-page

If you decide to use it, you are on your own and we don't provide support for this.

You can as well create your own React component to achieve your need.