ovflowd commented 1 year ago

FYI: This Description is Outdated! (Need update)

At our Collaborator Summit 2022 edition, we discussed a series of proposals on how we currently structure the metadata of our API docs. This proposal will eventually deprecate specific proposed changes here.

Within this issue, we will adhere to naming the proposal as an "API metadata proposal" for all further references.

The API Metadata Proposal

Proposal Demo: https://github.com/ovflowd/node-doc-proposal

Introduction

What is this proposal about? Our API docs currently face a few issues varying from maintainability and organization to the extra tooling required to make it work. These are namely the following:

The current infrastructure for doc generation is non-standard and not easy for newcomers to contribute to or update, as it builds complex ASTs with unified. This makes it harder to debug, update or change how things are done.
We use a specifically crafted Remark plugin (and ESLint config) to make some non-conforming rules work. Ultimately, the ESLint plugin does not ensure that certain things are valid Markdown either.
Our API docs use non-conforming, non-standard Markdown. As most Markdown parsers and linters are becoming stricter, it will eventually fail (and already does for specific parsers such as MDX). Our inline YAML snippets, for example, are also not validated; hence, some have "invalid" YAML syntax.
We require our infrastructure to interpolate content from Markdown and guess what is being done. For example, to get the Stability Index, the Level of Heading, or if the section refers to a class or method.
Some Markdown files are way too big. This outright makes the build process complex, and some pages become massive for the Web, being unreasonable for metered internet connections.
- Not to mention that from a maintainability standpoint, this is unfeasible.
This proposal will also achieve better-generated doc metadata that can be used by projects such as TypeScript
This proposal will also allow Internationalisation to be done as the metadata is separated from the actual Markdown files.

There are many other issues within the current API docs, from non-standard conventions and ensuring that rules are appropriately applied, to maintaining those files and creating sustainable docs that are inclusive for newcomers and well detailed.

The Proposal

This proposal, at its core, boils down to 4 simple changes:

All the actual API structure/metadata gets extracted to dedicated YAML files
- Each YAML file has its corresponding Markdown file
- E.g., doc/api/modules/fs/promises.metadata.yml has doc/api/modules/fs/promises.en.content.md
The folder structure for API docs gets updated in a tree fashion for the modules
- Each class has its YAML and Markdown file
- TL;DR files are broken down into their minimal section (being a class)
Markdown file is responsible for:
- Descriptions
- Introductions
- Examples
- References
- Real-world usages

Re-structuring the existing file directory

In this proposal, the tree of files gets updated by adopting a node approach (pun intended) for how we structure the files of our API docs and how we name them.

Notably, these are the significant changes:

The nature of a file categorizes the top-level folders; for example, anything related to a Node.js module will reside within modules. Globals, will, for example, reside within globals
- There's no concrete list of all the possible-top level folders for now; for example, "About this documentation," "How to install Node.js," or another kind of general Documentation related to Node.js would probably not fit on any of these folders. A suggestion would be a misc folder, but this is open for debate as this is not a crucial point.
The second level of folders, in the case of modules, is the name of the module (top-level) import. For example, "File Systems" would be "fs", resulting in doc/api/modules/fs
- Any other level of sub-directories would be a sub-namespace of the module. For example, node:fs/promises would be doc/api/modules/node/fs/promises.
- Finally, the last level would be the name of a Class e.g., doc/api/modules/node/fs/promises/file-handle.yaml, Whereas for the promises import itself, it would be doc/api/modules/node/fs/promises.yaml
- You will notice in the first case promises is a folder and in the second a YAML file; that's because we're following a Node approach, just like a Binary-Tree.
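The mapping above can be sketched as a small function. This is only an illustration: `metadataPathFor` is a hypothetical helper (not part of any existing tooling), and it assumes the `doc/api/modules/node/fs/promises` layout and the kebab-case class file names used in the examples above.

```javascript
// Hypothetical helper: maps an import specifier (and an optional class name)
// to the metadata file path described above. The root folder and the
// kebab-case naming are assumptions based on the examples in this proposal.
function metadataPathFor(specifier, className) {
  // "node:fs/promises" -> ["node", "fs", "promises"]
  const segments = specifier.split(/[:/]/).filter(Boolean);

  if (className) {
    // "FileHandle" -> "file-handle"
    const kebab = className
      .replace(/([a-z0-9])([A-Z])/g, '$1-$2')
      .toLowerCase();
    segments.push(kebab);
  }

  return `doc/api/modules/${segments.join('/')}.yaml`;
}

console.log(metadataPathFor('node:fs/promises'));
// doc/api/modules/node/fs/promises.yaml
console.log(metadataPathFor('node:fs/promises', 'FileHandle'));
// doc/api/modules/node/fs/promises/file-handle.yaml
```

Note how `promises` ends up as a file in the first call and as a folder in the second, which is the node approach described above.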

Accomplishing this change

This can be quickly done by an automated script that breaks down the existing files and generates the new ones. Using a script to split the files and create this node approach would, in the best scenario, work for all the current files in doc/api and, in the worst-case scenario, for 98% of them, depending on how consistently modules follow these patterns.

Extracting the metadata

As mentioned before, the Markdown files should be clean from the actual Metadata, only containing the Description, Introduction (when needed), Examples (both for CJS and MJS) and more in-depth details of when this class/method should be used, and external references that might be useful.

Extracting the metadata allows our contributors and maintainers to focus on writing quality documentation and not get lost in the specificities of the metadata.

What happens with the extracted metadata?

It will be added to a dedicated YAML file containing all the metadata of, for example, a particular class. (We created a new tooling infrastructure that would facilitate this being done here.)

The metadata structure will be explained in another section below.

The extraction and categorization process can be automated for all modules and classes, reducing (and erasing) the manual work needed to adopt this proposal.

Enforcing the Adoption of best practices

The actual content of the Markdown files will be "enforced" for Documentation reviewers and WGs for specific Node.js parts, possibly by the adoption of this PR.

The Metadata (YAML) schema

Similarly to the existing YAML schema, it would namely be structured as this:

name: 'api/modules/crypto/certificate'
source: "lib/crypto.js"
stability: stable
tags:
  - "certificates"
  - "digital certificates"
history:
  - type: added
    versions: [v0.11.8]
methods:
  - name: exportChallenge
    stability: deprecated
    static: true
    history:
      - type: added
        versions: [v9.0.0]
        pullRequest: "https://github.com/nodejs/node/pull/35093"
        details: "crypto.certificate.method.exportChallenge.history.[0].details"
    params:
      - name: spkac
        optional: false
        types:
          - String
          - ArrayBuffer
          - Buffer
          - TypedArray
          - DataView
      - name: encoding
        details: "crypto.certificate.method.exportChallenge.params.[1].details"
        optional: true
        types:
          - String
        defaults:
          - "UTF-8"
    returns:
      - type: Buffer
        details: "crypto.certificate.method.exportChallenge.returns.[0].details"
constants:
  - name: S_IXUSR
    import: "fs.constants.S_IXUSR"

The structure above makes it easy to structure and organise the metadata of each method available within a Class and to quickly describe the types, return types, parameters and history of a method, Class, or anything related.

I18n and ICU on YAML files

The structure is also I18n-friendly, as precise text details that should not be defined within the Markdown file can be easily referenced using the ICU format. These details can be accessed in files that sit at the same level as a specific module. For example, doc/api/modules/node/fs/promises.en.i18n.json contains entries that follow the ICU format, such as:

{
  "fs.promises.tags": ["writing files", "creating files", "file systems"],
  "fs.promises.method.lchmod.returns.[0].details": "The lchmod method returns a Boolean when specific parameters are ....",
  ...
}
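A minimal sketch of how a build step could swap those translation IDs for their text. Everything here is an assumption for illustration: `resolveDetails` is a hypothetical function, the metadata object is a trimmed stand-in for a parsed YAML file, and the message text is a placeholder rather than real documentation content.

```javascript
// A trimmed metadata object, shaped like a parsed YAML file (illustrative only).
const metadata = {
  name: 'api/modules/node/fs/promises',
  methods: [
    {
      name: 'lchmod',
      returns: [
        { type: 'Boolean', details: 'fs.promises.method.lchmod.returns.[0].details' },
      ],
    },
  ],
};

// Stand-in for the contents of promises.en.i18n.json (placeholder text).
const messages = {
  'fs.promises.method.lchmod.returns.[0].details': 'Placeholder text for the return value.',
};

// Recursively replaces every `details` translation ID with its text.
// Unknown IDs are left untouched so missing translations stay visible.
function resolveDetails(node, dictionary) {
  if (Array.isArray(node)) return node.map((item) => resolveDetails(item, dictionary));
  if (node === null || typeof node !== 'object') return node;

  const out = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] = key === 'details' && typeof value === 'string'
      ? (dictionary[value] ?? value)
      : resolveDetails(value, dictionary);
  }
  return out;
}

const resolved = resolveDetails(metadata, messages);
console.log(resolved.methods[0].returns[0].details);
// Placeholder text for the return value.
```

Because the function returns a new object, the same metadata can be resolved once per language file without being mutated.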

Specification Table

The table below demonstrates the entire length of the proposed YAML schema.

Note.: All the properties of type Enum will have their possible values discussed in the future, as this is just a high-level specification proposal.

Top Level Properties

Field	Optional	Type	Description
`name`	No	`String`	The Heading ID identifier for that module, should usually be the path of module on the `doc` folder.
`import`	No	`String`	The canonical import of the module (i.e. the string used to import this class/module). This will be used to generate CJS/MJS import usages.
`stability`	No	`Enum`	The Stability of a given module. It follows the widely adopted "Stability Index" from our existing docs.
`tags`	Yes	`Lang ID`	A translation ID for tags used to identify or help users to find this method with Search engines.
`history`	Yes	`Array<History>`	An array of history entries to decorate the notable historical changes of that module
`methods`	Yes	`Array<Method>`	The methods of that class/module
`constants`	Yes	`Array<Constant>`	The constants of that class/module
`source`	Yes	`String`	The path to the source of that class/module

History

Field	Optional	Type	Description
`type`	No	`Enum`	The type of the change
`pullRequest`	Yes	`String`	An optional Pull Request for the related landed change
`issue`	Yes	`String`	An optional Issue link for the related landed change
`details`	Yes	`Lang ID`	A translation ID for extra short details of that change. Actual details should usually link to a PR or Issue
`versions`	Yes	`Array<String>`	An array containing the versions this change impacted initially
`when`	Yes	`String`	A date string following the ISO-8601 (https://en.wikipedia.org/wiki/ISO_8601)

Method

Field	Optional	Type	Description
`name`	No	`String`	The Heading ID identifier for the method. It should also reflect to the actual name that is imported
`stability`	No	`Enum`	The Stability of a given method. It follows the widely adopted "Stability Index" from our existing docs.
`tags`	Yes	`Lang ID`	A translation ID for tags used to identify or help users to find this method with Search engines
`history`	Yes	`Array<History>`	An array of history entries to decorate the notable historical changes of that method
`returns`	No	`Array<ReturnType|Enum>`	An array containing the return types of the method
`params`	Yes	`Array<MethodParam>`	An array containing the parameters of the method

MethodParam

Field	Optional	Type	Description
`name`	No	`String`	The name of the parameter of the method
`optional`	No	`Boolean`	If the parameter is optional or not
`defaults`	Yes	`Array<ParameterDefault>`	An array containing the default values of the Parameter
`types`	No	`Array<ParameterType|Enum>`	An array containing the types of the Parameter

ReturnType, ParameterType, ParameterDefault

Field	Optional	Type	Description
`details`	Yes	`Lang ID`	A Translation ID for the details of this return type
`type`	No	`Enum`	The type of the return type
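As a rough illustration of how the tables above could be enforced, here is a sketch of a required-field check. `validateModule` and `REQUIRED` are hypothetical names; the field lists are simply the rows marked "Optional: No" in the tables, and Enum values are not checked since they are still to be discussed.

```javascript
// Required fields per entity, transcribed from the "Optional: No" rows above.
const REQUIRED = {
  module: ['name', 'import', 'stability'],
  history: ['type'],
  method: ['name', 'stability', 'returns'],
  param: ['name', 'optional', 'types'],
};

// Lists the required fields that are absent from one entity.
function missing(entity, kind, path) {
  return REQUIRED[kind]
    .filter((field) => entity[field] === undefined)
    .map((field) => `${path}.${field} is required`);
}

// Walks a parsed metadata object and collects every missing required field.
function validateModule(mod) {
  const errors = missing(mod, 'module', 'module');
  (mod.history ?? []).forEach((h, i) => errors.push(...missing(h, 'history', `history[${i}]`)));
  (mod.methods ?? []).forEach((m, i) => {
    const at = `methods[${i}]`;
    errors.push(...missing(m, 'method', at));
    (m.history ?? []).forEach((h, j) => errors.push(...missing(h, 'history', `${at}.history[${j}]`)));
    (m.params ?? []).forEach((p, j) => errors.push(...missing(p, 'param', `${at}.params[${j}]`)));
  });
  return errors;
}

const errors = validateModule({
  name: 'api/modules/crypto/certificate',
  stability: 'stable',
  methods: [{ name: 'exportChallenge', stability: 'deprecated', params: [{ name: 'spkac' }] }],
});
console.log(errors);
```

The YAML example earlier in this proposal has no `import` field, so a check like this would flag it until either the example or the table is adjusted.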

Incorporating the Metadata within the Markdown files

As each Class has numerous methods (possibly constants) and more, the parser needs to know where to attach the data within the final generated result when, for example, building for the web.

This would be quickly done by using Markdown compatible Heading IDs

# File Systems {#api/modules/node/fs/promises}

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Quisque non tellus orci ac. Maecenas accumsan lacus vel facilisis volutpat est velit egestas. Placerat in egestas erat imperdiet sed euismod. Egestas maecenas pharetra convallis posuere morbi leo urna molestie at. Ultricies mi eget mauris pharetra et ultrices neque ornare aenean. Sodales ut etiam sit amet nisl purus in. Nunc pulvinar sapien et ligula ullamcorper malesuada. Pulvinar neque laoreet suspendisse interdum. Lectus proin nibh nisl condimentum id. Habitant morbi tristique senectus et netus et malesuada fames ac. Nulla porttitor massa id neque aliquam vestibulum morbi.

## Method: LCHMOD {#lchmod}

Curabitur gravida arcu ac tortor dignissim convallis. Urna id volutpat lacus laoreet non curabitur. Sem integer vitae justo eget. Amet purus gravida quis blandit. Posuere urna nec tincidunt praesent semper feugiat nibh sed pulvinar. Nunc eget lorem dolor sed viverra ipsum nunc. Dignissim cras tincidunt lobortis feugiat. Maecenas pharetra convallis posuere morbi leo. Volutpat lacus laoreet non curabitur gravida arcu. Leo a diam sollicitudin tempor id.

....

The parser would map each Heading ID to the YAML entry whose name field matches it, allowing you to write the Heading as you wish while keeping the Heading ID intact.
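A minimal sketch of that mapping, assuming Heading IDs use the `{#id}` suffix shown above and that the metadata has already been parsed into a plain object. `extractHeadingIds` and `attachMetadata` are hypothetical names, not existing tooling.

```javascript
// Collects `{#id}` heading IDs from Markdown source.
function extractHeadingIds(markdown) {
  const pattern = /^(#{1,6})[ \t]+(.*?)[ \t]*\{#([^}]+)\}[ \t]*$/gm;
  return [...markdown.matchAll(pattern)].map(([, hashes, title, id]) => ({
    level: hashes.length,
    title,
    id,
  }));
}

// Pairs each heading with the metadata entry whose `name` equals the heading ID.
// Headings without a matching entry are reported rather than silently dropped.
function attachMetadata(markdown, metadata) {
  const entries = new Map([[metadata.name, metadata]]);
  for (const method of metadata.methods ?? []) entries.set(method.name, method);

  const matched = [];
  const unmatched = [];
  for (const heading of extractHeadingIds(markdown)) {
    const entry = entries.get(heading.id);
    if (entry) matched.push({ ...heading, entry });
    else unmatched.push(heading.id);
  }
  return { matched, unmatched };
}

const source = [
  '# File Systems {#api/modules/node/fs/promises}',
  '',
  'Some introduction.',
  '',
  '## Method: LCHMOD {#lchmod}',
  '',
  '## Method: Unknown {#doesNotExist}',
].join('\n');

const result = attachMetadata(source, {
  name: 'api/modules/node/fs/promises',
  methods: [{ name: 'lchmod', stability: 'deprecated' }],
});
console.log(result.matched.map((m) => m.id)); // [ 'api/modules/node/fs/promises', 'lchmod' ]
console.log(result.unmatched); // [ 'doesNotExist' ]
```

The `unmatched` list is the kind of information a lint step could use to fail the build when a heading points at metadata that does not exist.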

Naming for Markdown files

To ensure that we have a 1:1 mapping between YAML and Markdown, the Markdown files should reside in the same folder as the YAML ones and have the same name, the only difference being that the Markdown files have the .md extension in lowercase and are suffixed by their language, e.g. .en.md.

Note.: By default, the Markdown files will use the .en.md extension.

The Build Process

Generating the final result in a tangible, readable format for Humans and IDEs is no easy feat.

The new tooling build process would consist of two different outputs:

Generating JSON files from the YAML metadata.
- These are namely used for JSDocs or IDE scanning/IntelliSense, such as TypeScript (cc @nodejs/typescript)
Generating MDX Buffers that our Websites can use
- MDX is a JSX-in-Markdown format that allows us to insert React Components within our Codebase
- The idea here is, during the build process, to generate a Buffer that is the combination of the plain Markdown + React Components that are used to render the Metadata.
- This is more tooling required for the end-users of the documentation and is also helpful in previewing the documentation. This must be discussed on a separate Issue to address topics such as:
  - Where should the tooling reside
  - How to generate documentation previews just containing the documentation (not the whole website) and also allow generating docs only of what you changed (e.g., generating previews of a specific file)
  - How the files would be categorized
  - How links to the files, and redirects from the old API schema to the new one, would be handled
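To make the second output more concrete, here is a rough sketch of combining plain Markdown with a component that renders the metadata. `toMdx` and the `<ApiMetadata />` component are hypothetical names. The sketch also assumes the `{#id}` marker has to be removed from the heading, since MDX treats `{` as the start of an expression.

```javascript
// Builds an MDX string: the original Markdown plus a (hypothetical)
// <ApiMetadata /> component after every heading that carries a {#id}.
function toMdx(markdown, entriesById) {
  const headingWithId = /^(#{1,6}[ \t]+.*?)[ \t]*\{#([^}]+)\}[ \t]*$/;
  return markdown
    .split('\n')
    .flatMap((line) => {
      const match = headingWithId.exec(line);
      if (!match) return [line];

      const [, heading, id] = match;
      const entry = entriesById[id];
      // Headings without metadata only lose their {#id} marker.
      if (!entry) return [heading];

      // The metadata is serialised as JSON inside a JSX expression container.
      return [heading, '', `<ApiMetadata id="${id}" data={${JSON.stringify(entry)}} />`];
    })
    .join('\n');
}

const mdx = toMdx('## Method: LCHMOD {#lchmod}\n\nSome description.', {
  lchmod: { name: 'lchmod', stability: 'deprecated' },
});
console.log(mdx);
```

The JSON output mentioned in the first item would not need this step at all; it could be produced directly from the parsed YAML.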

Example of the file structure

An essential factor in easing the visualization of how this proposal would change the current folder structure is to show an example of how it would look with all the changes applied. The snippet below is an illustration of how it would look.

Note.: The root directory below would be doc/api.

├── api
│   ├── en.navigation.md
│   ├── documentation.en.content.mdx
│   ├── modules
│   │   ├── en.navigation.md
│   │   ├── fs
│   │   │   ├── en.navigation.md
│   │   │   ├── index.metadata.yml
│   │   │   ├── index.en.content.md
│   │   │   ├── promises.metadata.yml
│   │   │   ├── promises.en.content.md
│   │   │   └── ...
│   │   ├── streams
│   │   ├── crypto
│   │   │   ├── en.navigation.md
│   │   │   ├── webcrypto.metadata.yml
│   │   │   ├── webcrypto.en.i18n.json
│   │   │   └── webcrypto.en.content.md
│   │   └── ...
│   ├── globals
│   ├── others
│   ├── packages.en.content.md
│   └── ...
└── ...

The Navigation (Markdown) Schema

Navigating through the API docs is as essential as displaying the content correctly. The idea here is to allow each module to define its Navigation entries and then generate the whole Navigation by aggregating all the navigation files.

Book of Rules for the Navigation System

The Navigation file is made in Markdown and has a reserved name (navigation.md)
A navigation file can be on any sub-level of any directory
Navigation files are not imported automatically
The build-tools specify the main Navigation file (e.g.: build-docs --navigation-entry=doc/api/v18/navigation.md)
The order of items is respected as-is
Each Item can be either a:
- Heading without a link
- Heading referring to an entry (YAML file)
- Heading referring to another Navigation file (To import the entries there)
Cool part is that Navigation items can be anything you want, not limited to something generated.

Note.: The Navigation source would be on Markdown, using a Markdown List format with a maximum of X-indentation levels.

The Schema of Navigation

The code snippet below shows all examples of the Schema and how it would be generated in the end.

File: doc/api/v18/en.navigation.md

* [About this Documentation](documentation.en.content.md)
* [Modules](modules/en.navigation.md)
* Some Header
  * Sub-Levels Supported
    * To a certain-max-level
    * [An External Link](https://nodejs.org/certification)

File: doc/api/v18/modules/en.navigation.md

* [File System](fs/en.navigation.md)
* [Streams](streams/en.navigation.md)

File: doc/api/v18/modules/fs/en.navigation.md

* [About File Systems](fs.en.content.md)
* [File System Promises](promises.en.content.md)
* ....

Example output in Markdown

* [About this Documentation](documentation.en.content.md)
* Modules
  * File System
    * [About File Systems](fs.en.content.md)
    * [File System Promises](promises.en.content.md)
  * Streams
    * ....
* Some Header
  * Sub-Levels Supported
    * To a certain-max-level
    * [An External Link](https://nodejs.org/certification)

It is essential to mention that the final output of the Navigation would be Markdown and can be used by the build tools to either generate an output on MDX or plain HTML or JSON.
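The aggregation can be sketched as a small recursive function. This is an illustration under a few assumptions: `expandNavigation` is a hypothetical name, files are read from an in-memory map instead of disk, links to other navigation files are resolved relative to the file that contains them, and neither `..` segments nor circular imports are handled.

```javascript
// In-memory stand-ins for the navigation files shown above (shortened).
const files = {
  'doc/api/v18/en.navigation.md': [
    '* [About this Documentation](documentation.en.content.md)',
    '* [Modules](modules/en.navigation.md)',
    '* Some Header',
    '  * Sub-Levels Supported',
  ].join('\n'),
  'doc/api/v18/modules/en.navigation.md': '* [File System](fs/en.navigation.md)',
  'doc/api/v18/modules/fs/en.navigation.md': [
    '* [About File Systems](fs.en.content.md)',
    '* [File System Promises](promises.en.content.md)',
  ].join('\n'),
};

const read = (path) => {
  if (!(path in files)) throw new Error(`Missing navigation file: ${path}`);
  return files[path];
};

const dirname = (path) => path.slice(0, path.lastIndexOf('/'));

// Expands a navigation file into a flat list of Markdown list lines.
// A link to another navigation file becomes a plain heading with that
// file's entries nested below it; everything else is kept in order, as-is.
function expandNavigation(entryPath, readFile, indent = '') {
  const out = [];
  for (const line of readFile(entryPath).split('\n')) {
    const item = /^(\s*)\* \[([^\]]+)\]\(([^)]+)\)\s*$/.exec(line);
    if (item && item[3].endsWith('navigation.md')) {
      const [, own, title, target] = item;
      out.push(`${indent}${own}* ${title}`);
      out.push(...expandNavigation(`${dirname(entryPath)}/${target}`, readFile, `${indent}${own}  `));
    } else if (line.trim() !== '') {
      out.push(indent + line);
    }
  }
  return out;
}

console.log(expandNavigation('doc/api/v18/en.navigation.md', read).join('\n'));
```

Link targets are kept exactly as written in the imported file, which matches the example output above; a real implementation would probably need to rebase them.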

Conclusion

As explained before, the proposal has several benefits and would significantly add to our Codebase. Knowing that the benefits range from tooling, build process, maintainability, adoption, ease of documentation, and translations to even more, this proposal is fated to succeed! Also, all the items explained here can be automated, ensuring a smooth transition process.

ovflowd commented 1 year ago

cc @mhdawson @nodejs/next-10 @Trott

Trott commented 1 year ago

@nodejs/tsc @nodejs/documentation

Trott commented 1 year ago

We'll need to make sure this process doesn't add any work for releasers. (I don't think it would, but writing it here just in case.)

This will also be a good opportunity hopefully to fix our version picker quirks, at least for future versions of Node.js.

Trott commented 1 year ago

I like this a lot, although of course we'll see what kinds of unforeseen practical problems (if any) arise in the course of implementation.

I wonder if 20.x and forward is more realistic than 18.x and forward. I wouldn't complain if we got this working sooner than 20.x though.

Can we try to determine which parts of this can be done incrementally and which need to happen all-at-once? I'm trying to understand how many steps are involved here. (And if it's one big step, that's OK, but of course we'll want to automate everything because keeping the docs in synch with the current version will be an annoying problem otherwise.)

Trott commented 1 year ago

Is the idea that this would work on the current nodejs.org as well as on nodejs.dev or is the vision here that the nodejs.dev tech/build stack replaces what's on nodejs.org and that's a prerequisite for this to work?

ovflowd commented 1 year ago

Is the idea that this would work on the current nodejs.org as well as on nodejs.dev or is the vision here that the nodejs.dev tech/build stack replaces what's on nodejs.org and that's a prerequisite for this to work?

In theory, it could also work on nodejs.org. Going back to the topic of "The build process": if we outsource the tooling created on the nodejs.dev repo (which should be pretty much independent of whatever static framework you use), then yes. A few tweaks would be needed, but in the end, we could reuse the HTML generation part of the existing nodejs/node/tools/doc.

For nodejs.dev, no extra steps are needed; still, I would like to outsource the tooling.

ovflowd commented 1 year ago

Can we try to determine which parts of this can be done incrementally and which need to happen all-at-once? I'm trying to understand how many steps are involved here.

I foresee 4 major steps:

Reach a consensus on the properties of the YAML (the schema)
Reach a consensus on the tooling and where it should reside
Outsource and update the tooling to extract the stuff correctly (this would be a one-time change)
- This would generate the YAML and MD files with the correct directory tree. It would require some changes to the tooling we made on nodejs.dev, but it's not a complicated change.
Update the final tooling to also generate HTML and JSON.

That's it. Basically the migration itself can be mass done safely.

ovflowd commented 1 year ago

I wonder if 20.x and forward is more realistic than 18.x and forward. I wouldn't complain if we got this working sooner than 20.x though.

Indeed, I was trying to think about retroactively updating back to v18, as v18 is the first version whose API docs are the most Markdown-conforming. (I'm referring to the v18 git tree; on that tree, it seems like all the doc pages follow the current doc specs, at least for the metadata, which is why migrating at once would be seamless.)

ovflowd commented 1 year ago

Proposal Updates

I'm going to update the main proposal adding the following missing sections

How Navigation would be structured and generated (The order of each item, their titles and stuff)
Example of a folder structure with all files

sheplu commented 1 year ago

Really great proposal! A lot of topics are covered, which is really great as this gives a good overview of everything that will require some work. Good choice to not address all the subjects here as it would be too long, but good thinking mentioning them here (tooling, i18n...), which will allow us to easily link the PRs

Just a few questions:

Versioning doc: keep all the versions accessible on the website ? How to easily update across multiple versions ? Doc on odd version or just even ? or all ?
Build process: include a way to generate the doc from source? To generate part of doc ? Generate whole doc ? Pdf too ? (Maybe better to discuss about that when we will talk about the tooling ?)
I am not against yaml but why not have directly the json and not the yaml ? is there some technical stuff blocking us from that ? or is it DX related ?
maybe on the tooling part, we should add / ensure full compliance of the doc ? Way to tests if the heading Id exists for example

Following @Trott comments I would agree that v20 would be the best time to have it. Will be short for the others version before that. But do we want to provide a retroactive doc for stuff before v20 ? if yes which version ? should we have all the LTS covered ?

A lot of question from my side :)

ovflowd commented 1 year ago

Build process: include a way to generate the doc from source? To generate part of doc ? Generate whole doc ? Pdf too ? (Maybe better to discuss about that when we will talk about the tooling ?)

As I mentioned before, the building tools will allow you to build just a subset of files if you want. I don't think HTML, PDF and JSON generation should be part of the core of the tooling, but could be added on top of it such as:

import docTooling ....

const result = docTooling.generateDocs();

return myPdfLibrary...

We could add all kinds of output generation on top, but the core tooling is responsible for creating a JavaScript object tree with the "metadata" and content aggregated. Initially, the idea is to be a JSX Buffer (MDX), but we could also just return the result into a JavaScript object with the metadata and content. And then have a plugin that generates to MDX, as, for example, we would have for HTML, PDF, JSON...

E.g. (Of the object) for the promises module:

{
  "promises": {
    ... all the metadata fields,
    details: "the content from the Markdown file",
  }
}
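A rough sketch of that layering, with the core producing a plain object and output formats added as plugins on top. All names here (`generateDocs`, `plugins`, `render`) are hypothetical and only illustrate the shape of the idea, and the HTML plugin is deliberately simplistic.

```javascript
// Minimal HTML escaping for the illustrative HTML plugin below.
const escapeHtml = (text) => String(text)
  .replace(/&/g, '&amp;')
  .replace(/</g, '&lt;')
  .replace(/>/g, '&gt;');

// Core step: merge metadata and Markdown content into one plain object.
function generateDocs(sources) {
  const tree = {};
  for (const { key, metadata, markdown } of sources) {
    tree[key] = { ...metadata, details: markdown };
  }
  return tree;
}

// Output plugins are plain functions layered on top of the core result.
const plugins = {
  json: (tree) => JSON.stringify(tree, null, 2),
  html: (tree) => Object.entries(tree)
    .map(([key, doc]) => `<section id="${escapeHtml(key)}"><h1>${escapeHtml(doc.name)}</h1><p>${escapeHtml(doc.details)}</p></section>`)
    .join('\n'),
};

function render(tree, format) {
  const plugin = plugins[format];
  if (!plugin) throw new Error(`No output plugin registered for "${format}"`);
  return plugin(tree);
}

const tree = generateDocs([
  {
    key: 'promises',
    metadata: { name: 'api/modules/node/fs/promises', stability: 'stable' },
    markdown: 'The content from the Markdown file',
  },
]);
console.log(render(tree, 'json'));
```

With this split, an MDX or PDF generator would just be another entry in `plugins`, and the core would not need to know about it.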

Versioning doc: keep all the versions accessible on the website ? How to easily update across multiple versions ? Doc on odd version or just even ? or all ?

This is not a responsibility for this proposal.

I am not against yaml but why not have directly the json and not the yaml ? is there some technical stuff blocking us from that ? or is it DX related ?

YAML is more accessible to write than JSON and easier to read. It also means less overhead during the transition period. JSON is just a JavaScript object and, to a certain point, is not really human-friendly (IMHO)

maybe on the tooling part, we should add / ensure full compliance of the doc ? Way to tests if the heading Id exists for example

If it is not compliant, it wouldn't even build (give an error), but this should not be a responsibility of the tooling; it could be part of the build process by using tools such as Remark, and ESLint, for example.

aduh95 commented 1 year ago

YAML is more accessible to write than JSON and easier to read

I think that's debatable, YAML can be very hard for humans as well (e.g. multiline strings is non-intuitive, the type guessing makes it that sometimes one mistakes a string for a number, etc.). Other markup languages, such as e.g. TOML or JSON, do not have those problems. I'm not saying those are deal breakers for using YAML, or that we should not consider YAML for this use-case, but I think we should not disregard the problems of that syntax.

ovflowd commented 1 year ago

(e.g. multiline strings is non-intuitive, the type guessing makes it that sometimes one mistakes a string for a number, etc.).

Gladly, none of those apply to our schema 😛

ovflowd commented 1 year ago

Other markup languages, such as e.g. TOML or JSON, do not have those problems. I'm not saying those are deal breakers for using YAML, or that we should not consider YAML for this use-case, but I think we should not disregard the syntax problems.

Every markup language has its pros and cons. I just personally (please take it with a grain of salt) believe that, in this case, the pros of using YAML are better.

mhdawson commented 1 year ago

Thanks for the comprehensive proposal!

I think this

Example of a folder structure with all files

will definitely help me understand/consume what you are suggesting.

ovflowd commented 1 year ago

@mhdawson @Trott I just updated it :)

ovflowd commented 1 year ago

Friendly bump for @mhdawson @Trott so we can proceed with the next steps of this proposal :D

Trott commented 1 year ago

It seems like the "move the YAML to a separate file" part can happen pretty much at any time as long as someone is willing to update the relevant tooling. Would it be beneficial to do this right away so that there's one less structural change to make the rest of this proposal happen?

ovflowd commented 1 year ago

It seems like the "move the YAML to a separate file"

Hmm, given the way the YAML is structured right now in the Markdown, there would possibly be no benefit in extracting it. At least to a certain degree, the proposed YAML structure needs to be implemented.

I also think I got tasked with making a demo repository with example contents 🤔

mhdawson commented 1 year ago

@ovflowd we had discussed an example of what the directory would look like for a single API. Is that what you meant by a demo repository with example contents?

ovflowd commented 1 year ago

Yup, pretty much!

ovflowd commented 1 year ago

I had a meeting with @mhdawson, and here's the execution plan for this proposal:

Write a tool to convert the old doc format (the files from doc/api) to the proposal format here. This can pretty much be reused from here
- Instead of outputting an MDX file, the tooling can be updated to gather the data and perform all operations needed to output the metadata in YAML, split the Markdown files, and create the new folder structure.
Update the GitHub Actions Workflows to introduce a new linting step that always runs the new tooling in a "staging/dry-run" fashion but that breaks if anything is invalid. This is useful to enforce any new changes to the doc files to conform with the doc standards.
Introduce a new core tooling for parsing the new doc file format and sub-modules (plugins) to generate output in numerous formats such as:
- HTML (Plain HTML output to mimic current doc generation)
- JSON (As the current JSON format)
- MDX (For the new Website)
Sniff test to check if the generated HTML files, MDX files and JSON files work correctly and test if the tooling is working.
Switchover to the new docs format by making a big-bang PR (runs the converter) with all the file changes.
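As a rough idea of what the first step involves, here is a sketch that separates the inline `<!-- YAML ... -->` comment blocks used by the current doc/api files from the surrounding Markdown. `splitLegacyDoc` is a hypothetical name, the sample input is made up, and a real converter would still have to parse each snippet and map it to the new schema.

```javascript
// Splits a legacy doc into clean Markdown and the raw inline YAML snippets.
// Nothing is written to disk, so this could also back a dry-run lint step.
function splitLegacyDoc(markdown) {
  const yamlComment = /<!--\s*YAML\s*\n([\s\S]*?)-->\n?/g;
  const snippets = [];
  const content = markdown
    .replace(yamlComment, (_, body) => {
      snippets.push(body.trimEnd());
      return '';
    })
    // Collapse the blank lines left behind by removed comment blocks.
    .replace(/\n{3,}/g, '\n\n');
  return { content, snippets };
}

// Made-up input that only mimics the shape of an existing doc/api section.
const legacy = [
  '## Example method',
  '',
  '<!-- YAML',
  'added: v0.0.0',
  '-->',
  '',
  'Description of the example method.',
].join('\n');

const { content, snippets } = splitLegacyDoc(legacy);
console.log(snippets); // [ 'added: v0.0.0' ]
console.log(content);
```

Each snippet would then be attached to the heading that precedes it, which is the part that decides where it lands in the new YAML files.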

Original source: https://docs.google.com/document/d/1pRa7mqfW79Hc_gDoQCmjjVZ_q9dyc2i7spUzpZ1CW5k

ovflowd commented 1 year ago

@mhdawson I'm going to proceed with the demo (example) (mentioned here https://github.com/nodejs/next-10/issues/166#issuecomment-1322363051) very possibly during December.

sheplu commented 1 year ago

Following the discussion during the last next-10 meeting, it could be great to create another meeting / discussion channel and only keep the update during the next-10 meeting. As this topic is really complex and has a lot of impact, it will take up time and "block" other global topics. What do you think @ovflowd? Also, because you are leading this initiative, when would be the best time for you? (we can discuss it on slack, it could be easier)

mhdawson commented 1 year ago

Once the demo is in place, I'll get a presentation to the TSC onto the TSC agenda, likely at a meeting in Jan.

ovflowd commented 1 year ago

@ovflowd ? Also because you are leading this initiative when would be the best time for you ? (we can discuss it on slack it could be easier)

Hmm, let's talk about this on the next Next-10 meeting so we can get in sync about this! :D

AugustinMauroy commented 1 year ago

Ok great, but I don't see what it brings compared to the docs on nodejs.dev? except more files to manage

ovflowd commented 1 year ago

Ok great, but I don't see what it brings compared to the docs on nodejs.dev? except more files to manage

I don't want to sound rude, but I think you lack the context behind this proposal 🤔

The API Docs you see on https://nodejs.dev are generated through a script that processes the source API Documentation files. This proposal aims to address several long-standing issues from those files that are the source of the documentation.

And to answer your question, yes, there are more files to manage. The pros-cons are all outlined on the proposal.

AugustinMauroy commented 1 year ago

What I meant was that if we wrote (on nodejs/node) like on nodejs.dev wouldn't it be easier?

And you're not rude at all

ovflowd commented 1 year ago

What I meant was that if we wrote (on nodejs/node) like on nodejs.dev wouldn't it be easier?

Nope, it wouldn't be easier at all. The current files on Nodejs.dev are "generated" ones, meaning that they're generated to be compatible with a technology we use called MDX. Think about them as the "output of a build system". They're no improvement at all for the Developer Experience of the average contributor of Node.js

AugustinMauroy commented 1 year ago

I didn't see it as an mdx file. So I validate your idea!

Trott commented 1 year ago

Ok great, but I don't see what it brings compared to the docs on nodejs.dev?

Anything that requires core developers to have to go to a different repo to see what doc changes will look like is a dealbreaker. Anything that requires more work for core developers to validate documentation changes than they do right now is a dealbreaker.

So, if you're suggesting "move the nodejs.dev documentation generation process to core and then core devs can run make doc-only like they do now and see what the website will look like", then sure, that's a possibility.

But if you're suggesting that the website have a different process to generate docs than core, and that the docs on the website look different from core unless core devs take an additional step, that's not going to work.

ovflowd commented 1 year ago

@sheplu @mhdawson here's the repository containing an "example" of what the metadata proposal would look like: https://github.com/ovflowd/node-doc-proposal

AugustinMauroy commented 1 year ago

@nodejs/crowdin-managers What do you think of this change, how will it impact crowdin?

ovflowd commented 1 year ago

@AugustinMauroy this has nothing to do with Crowdin...

AugustinMauroy commented 1 year ago

@ovflowd The question was to know if the structure modification will work with the Crowdin tool

ovflowd commented 1 year ago

I repeat myself, this has nothing to do with Crowdin.

Crowdin is not even used for the Node.js API docs. I don't see an easy way of implementing it, nor am I sure we should for the time being. Also, the Crowdin managers (the people you pinged) only manage the instance.

AugustinMauroy commented 1 year ago

For your information, Node.js has a Crowdin for API docs, but the GitHub integration was broken.

ovflowd commented 1 year ago

For your information, Node.js has a Crowdin for API docs, but the GitHub integration was broken.

We might have a "group" inside Crowdin, but API Docs were never integrated with Crowdin. I'm quite sure about that, but of course, I could be wrong. Still, this is off-topic @AugustinMauroy, pretty please, let's stay on-topic here.

richardlau commented 1 year ago

Has any thought been given as to how we handle the switchover/migration? In particular how this will affect porting stuff between main and any versions of Node.js on the new system and LTS/older versions of Node.js on the old one? For example, presently when we merge something into LTS the release commit from the LTS release is cherry-picked to main and that (generally) takes care of updating the "added in" metadata.

ovflowd commented 1 year ago

@richardlau it was written in one of the comments: https://github.com/nodejs/next-10/issues/166#issuecomment-1327867224

In particular how this will affect porting stuff between main and any versions of Node.js on the new system and LTS/older versions of Node.js on the old one?

As we discussed, including in Next-10 meetings, the metadata proposal applies only to new versions of Node.js; it is not going to be ported to old versions of the docs (as this is pointless).

For example, presently when we merge something into LTS the release commit from the LTS release is cherry-picked to main and that (generally) takes care of updating the "added in" metadata.

The idea is to release this proposal on the next LTS version. I'm not sure I got exactly what you're asking here, so it would be nice if you could explain it better :)

richardlau commented 1 year ago

@richardlau it was written in one of the comments: #166 (comment)

As we discussed, including in Next-10 meetings, the metadata proposal applies only to new versions of Node.js; it is not going to be ported to old versions of the docs (as this is pointless).

For example, presently when we merge something into LTS the release commit from the LTS release is cherry-picked to main and that (generally) takes care of updating the "added in" metadata.

The idea is to release this proposal on the next LTS version. I'm not sure I got exactly what you're asking here, so it would be nice if you could explain it better :)

@ovflowd I mean that we frequently port things between releases and the main branch. Maybe examples will make this clearer: e.g.

Backports, e.g. https://github.com/nodejs/node/pull/44976. This is taking commits from main and backporting them to older versions.
Forward ports, e.g. https://github.com/nodejs/node/commit/a14244ce26cdab028482ca8ec6224cb839de9c75 is an example of a release commit for an LTS release being cherry-picked onto main to add the changelogs and update the doc metadata. I really want to minimise any additional work releasers have to do.

If the metadata is now in different formats between the branches being picked from and to, that's extra work to convert between the formats.

ovflowd commented 1 year ago

Backports, e.g. https://github.com/nodejs/node/pull/44976. This is taking commits from main and backporting them to older versions.

Well, in this case, the docs of the change on main, when backported through cherry-pick, will of course need an edit during the cherry-pick (as when you do an interactive rebase).

This is the pain of transitioning from one standard to another, and it stems from the docs being coupled to the commit of the change itself. I imagine this will not happen often, and as we move forward all the "backported" and "forward-ported" versions will use the new metadata proposal.

This is another reason why we want to release this together with a major semver, like v20. Yes, if we need to backport or forward-port things to/from v18, we will need to edit the cherry-pick on the fly, or possibly have a separate commit for the docs.

If the metadata is now in different formats between the branches being picked from and to, that's extra work to convert between the formats.

I agree, but this is a short term issue as far as I can see.

ruyadorno commented 1 year ago

@ovflowd I'm sorry but I'm afraid this might be a major blocker to the proposal (I apologize but I should have caught it during your presentation on Wednesday).

I agree, but this is a short term issue as far as I can see.

v18 would be the last LTS version containing the previous docs, and it reaches end of life in April 2025, assuming this proposal lands in time for v20. Unfortunately I don't believe this to be a short term issue.

I'm happy to take some time to chat about it, show how fundamental the backports are to the LTS release lines in our current release model and brainstorm ways to improve that migration story.

ovflowd commented 1 year ago

Hey @ruyadorno, I don't think this will be an issue at all. The way I see it, to make backports easy and feasible, the tooling that converts the old (current) API doc format into the new format (from this proposal) should also be able to convert it back to the old format.

I'm thinking of something like this:

# converts the old format to the new one, writing all generated files to the out directory
node-api-tool -c node/doc/api/buffers.md -o out/
# converts the new format back to the old format
node-api-tool -b -c out/module/buffers -o out_old_format/

It's just an example, but this could at least automate the backporting of the old doc format. Note that forward-porting is not an issue because the proposal initially already aims to have tooling for transforming the old format into the new one.
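To make the round trip concrete, here is a minimal Python sketch of the property such a tool would need: extracting metadata from an old-format file and embedding it again should reproduce the original file. Everything here is an assumption for illustration: the function names are hypothetical, the heading and version are made up, and only flat `key: value` metadata inside a `<!-- YAML ... -->` comment is handled, whereas real metadata blocks can nest.

```python
import re

# Matches an old-format metadata comment such as "<!-- YAML\nadded: v1.0.0\n-->".
# Simplification: trailing newlines after the comment are consumed as well.
YAML_COMMENT = re.compile(r"<!-- YAML\n(.*?)\n-->\n*", re.DOTALL)


def extract_metadata(markdown: str) -> tuple[dict[str, str], str]:
    """Split old-format Markdown into (metadata, body without the comment)."""
    match = YAML_COMMENT.search(markdown)
    if match is None:
        return {}, markdown
    metadata = {}
    for line in match.group(1).splitlines():
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    body = markdown[: match.start()] + markdown[match.end():]
    return metadata, body


def embed_metadata(metadata: dict[str, str], body: str) -> str:
    """Re-insert the metadata comment after the first block (the heading)."""
    if not metadata:
        return body
    heading, _, rest = body.partition("\n\n")
    lines = "\n".join(f"{key}: {value}" for key, value in metadata.items())
    return f"{heading}\n\n<!-- YAML\n{lines}\n-->\n\n{rest}"


# Hypothetical old-format snippet used to check the round trip.
old_doc = (
    "## example.method()\n\n"
    "<!-- YAML\nadded: v1.0.0\n-->\n\n"
    "Does something.\n"
)
metadata, body = extract_metadata(old_doc)
round_tripped = embed_metadata(metadata, body)
```

If `round_tripped` equals `old_doc` for every file in `doc/api`, the backwards conversion could in principle be automated; any file where it differs would still need manual attention.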

What do you think?

BethGriggs commented 1 year ago

I believe the workflow we need to preserve for backporting is the ability to git cherry-pick any commit that touches documentation on main back to prior release lines. We rely on automation and scripts that do this for us. If the proposal results in us hitting a conflict each time we backport a documentation change from main, and have to manually apply the diff to a different file in the tree, that would be a significant amount of added work for releasers. With potentially ~150 commits per current release, many would touch documentation (particularly the semver minors), so it's a lot of effort to manually apply those changes. And as @RuyAdorno mentions, that divergence would need to be handled until the EOL of Node.js 18.

(Sorry, my understanding is limited, but I believe changing the directory structure would impact our ability to git cherry-pick back from main cleanly.)

From the little I've dug in the proposal brings some great benefits (appreciate your efforts @ovflowd!).

Perhaps there's some Git magic/mapping or automation we can create to mitigate that in our tooling, but we'd need to prove it out and have it ready to go. An alternative may(?) be to manually backport/land the proposed new structure to all active release lines at the same time... but that would involve a lot of additional effort and coordination.

ovflowd commented 1 year ago

Perhaps there's some Git magic/mapping or automation we can create to mitigate that in our tooling, but we'd need to prove it out and have it ready to go. An alternative may(?) be to manually backport/land the proposed new structure to all active release lines at the same time... but that would involve a lot of additional effort and coordination.

Well, thanks for your insights! I really appreciate it. Here are some ideas we can try to plan out:

I believe that migrating previous versions of Node.js API docs to the new format can be done, but it depends on how far back we want to go with backporting the changes. AFAIK, v16 is the oldest version where all the API Markdown files consistently follow the current (old) format. v14 already has files not following the format, and things get messier the further we go back.
If we don't want to migrate older versions, we can still add the "cli" tool I mentioned to the backporting workflow. If we have a workflow that backports docs files, we can make it (bash script? js script?) execute the CLI during an interactive cherry-pick. This means, of course, that the cherry-picked commit will differ from the original, but it would require zero manual work.
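As a rough sketch of the first step such a backporting script might take, the Python below lists the files a commit touches and separates the API doc files from the rest, so that only commits touching docs need the conversion step. The function names are hypothetical, and the `doc/api/` prefix is assumed from the path used in the CLI example earlier in the thread; `git diff-tree` is a standard Git command, but whether this fits the releasers' existing automation is untested.

```python
import subprocess

# Assumed location of the API docs inside the nodejs/node tree.
DOC_PREFIX = "doc/api/"


def changed_paths(commit: str, repo: str = ".") -> list[str]:
    """Return the paths touched by a (non-merge) commit via `git diff-tree`."""
    result = subprocess.run(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", commit],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def split_doc_paths(paths: list[str]) -> tuple[list[str], list[str]]:
    """Separate API doc files (which need conversion) from everything else."""
    docs = [path for path in paths if path.startswith(DOC_PREFIX)]
    other = [path for path in paths if not path.startswith(DOC_PREFIX)]
    return docs, other


# Example with a hand-written path list instead of a real commit.
docs, other = split_doc_paths(
    ["doc/api/buffer.md", "lib/buffer.js", "test/parallel/test-buffer.js"]
)
```

A commit whose `docs` list is empty could be cherry-picked exactly as today; only the others would go through the format conversion before being committed on the older release line.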

Let me know what you think :)

BethGriggs commented 1 year ago

I believe that migrating previous versions of Node.js API docs to the new format can be done, but it depends on how far back we want to go with backporting the changes. AFAIK, v16 is the oldest version where all the API Markdown files consistently follow the current (old) format. v14 already has files not following the format, and things get messier the further we go back.

Thinking about the timelines, this may be a reasonable option. By the time Node.js 20 is released, the release lines may be in a state where it's manageable to backport the proposal only to Node.js 18:

Node.js 19 - likely to have no more releases after April, EOL by June 2023
Node.js 18 - still in active development, will have regular backports
Node.js 16 - maintenance, EOL in September 2023
Node.js 14 - likely to have no more releases, EOL in April 2023

Maintenance releases are typically very small (10-20 commits), so it might be a manageable amount of work to handle the divergence for Node.js 16 for the 5 months until it's EOL in September 2023. Perhaps backporting this proposal only as far back as Node.js 18 is a feasible option.

If we don't want to migrate older versions, we can still add the "cli" tool I mentioned to the backporting workflow. If we have a workflow that backports docs files, we can make it (bash script? js script?) execute the CLI during an interactive cherry-pick. This means, of course, that the cherry-picked commit will differ from the original, but it would require zero manual work.

I think I'd need to think about this in more detail and maybe trial it out, but yeah, something like this may work so long as we can keep a handle on the individual/logical commits.

(cc: @nodejs/releasers, perhaps @targos has thoughts)

richardlau commented 1 year ago

I agree that we'll need a solution for Node.js 18, and could possibly manage without one for the remainder of Node.js 16 (assuming the change lands for Node.js 20).

targos commented 1 year ago

I think the best would be to backport the refactor to Node.js 18 (not necessarily at the same time as v20, but we should schedule a release for it). I agree that we don't have to care too much about v14 and v16.

nodejs / next-10

Metadata Proposal for Docs #166
