nim-lang / RFCs

A repository for your Nim proposals.

Split documentation generation for easier tooling and better jsondoc #447

Open PMunch opened 2 years ago

PMunch commented 2 years ago

This is something which has irked me ever since I tried jsondoc ages ago, and a situation I've tried to improve but which never seems to reach full quality. What I'm proposing is simple: remove all the HTML and LaTeX generation from the Nim compiler and make it output only JSON. Then have one separate tool to convert from JSON to HTML and another to convert from JSON to LaTeX.

The main benefit of this system is that the JSON output of the documentation generation would always include at least all the information required to build a site similar to the ones built by the HTML output. Time and time again I've tried to create my own documentation format for a specific project, only to find I was missing the information needed to get it on par with the official documentation. This is frustrating because I know the compiler has the information when it creates the JSON output, but throws it away instead of including it.

Another benefit is that the compiler would be slightly simpler, since two of the documentation generation passes would be gone.

The way I imagine this being implemented is fairly simple. First, remove the JSON and LaTeX generation from the compiler. Then rewrite the HTML generation to output JSON instead. Next, implement two separate programs that take the generated JSON and convert it to HTML and LaTeX. The HTML generation should be pretty straightforward: it would probably only need to render the JSON 1:1, since the JSON was derived from the old HTML generation.

Once the parts are made, it's time to sew it all together. I propose that the nim doc command take an optional flag, --docgenerator or something similar, which determines which program the JSON documentation is fed to; it would default to the HTML generation tool shipped with Nim. The doc2tex command would work exactly the same way, but set --docgenerator to the LaTeX generating tool. We could also consider implementing --docgenerator=json or --docgenerator=raw to just output the JSON data, and letting jsondoc set that as its default.
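To make the pipeline above concrete, here is a minimal sketch in Python of the dispatch step. It assumes a made-up JSON shape (module, entries, name, code, description) that is not the actual jsondoc schema. The generators are plain functions here rather than separate programs, and generate and its docgenerator argument are illustrative names standing in for the proposed --docgenerator flag.

```python
import html
import json


def render_html(doc: dict) -> str:
    """Render a parsed documentation dict as a very small HTML fragment."""
    parts = [f"<h1>{html.escape(doc['module'])}</h1>"]
    for entry in doc.get("entries", []):
        parts.append(f"<h2>{html.escape(entry['name'])}</h2>")
        parts.append(f"<pre>{html.escape(entry['code'])}</pre>")
        parts.append(f"<p>{html.escape(entry.get('description', ''))}</p>")
    return "\n".join(parts)


def render_raw(doc: dict) -> str:
    """Re-serialize the JSON without changing its content (the json/raw option)."""
    return json.dumps(doc, indent=2)


# Hypothetical registry; in the proposal each value would be an external program.
GENERATORS = {"html": render_html, "json": render_raw}


def generate(json_text: str, docgenerator: str = "html") -> str:
    """Feed compiler-produced JSON to the selected generator."""
    try:
        generator = GENERATORS[docgenerator]
    except KeyError:
        raise ValueError(f"unknown doc generator: {docgenerator!r}") from None
    return generator(json.loads(json_text))


# Assumed sample input; the field names are illustrative only.
sample = json.dumps({
    "module": "demo",
    "entries": [
        {"name": "add",
         "code": "proc add(a, b: int): int",
         "description": "Adds two numbers."},
    ],
})
print(generate(sample))
```

A project with its own documentation system would register another generator (or, in the real proposal, point the flag at its own program) and leave the compiler untouched.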

This would make it much easier to generate documentation that fits in style with a project, or which feeds into an existing documentation system. As a bonus it would also simplify the compiler.

Araq commented 2 years ago

Sounds good. Probably a hell of a refactoring, but the idea is good.

PMunch commented 2 years ago

That's why I wanted to do the RFC first; I wouldn't want to do all that refactoring only to have the PR turned down.

PhilippMDoerner commented 9 months ago

I am in full-fledged support of this (not that my opinion matters too much), to the degree that I'd be willing to contribute. I'd want to wrap up my current work on another project first (threadButler), but the fact that I can't have "proper" docs for its other modules is motivation enough.

Step 1 would basically be to think up a data model for the JSON output, right?

PMunch commented 9 months ago

More or less, yes. The current jsondoc is about 90% there, so it should serve as a very good starting point. The only reason I propose removing the current jsondoc output and rewriting the HTML output in its place is that figuring out how to convert the output from HTML to JSON is probably easier than rewriting the compiler path to get the correct information for the jsondoc target.
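As a rough illustration of what such a data model could look like, here is a sketch in Python using dataclasses. Every name in it (ModuleDoc, Entry, and the fields module, kind, code, description, line) is an assumption made for illustration and has not been checked against what jsondoc currently emits.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Entry:
    """One documented symbol (hypothetical fields)."""
    name: str
    kind: str              # e.g. "proc", "type", "const"
    code: str              # the declaration as written in the source
    description: str = ""  # the doc comment attached to the symbol
    line: int = 0


@dataclass
class ModuleDoc:
    """Documentation for a single module (hypothetical fields)."""
    module: str
    description: str = ""
    entries: list[Entry] = field(default_factory=list)


def to_json(doc: ModuleDoc) -> str:
    """Serialize the model; asdict recurses into the nested entries."""
    return json.dumps(asdict(doc), indent=2)


def from_json(text: str) -> ModuleDoc:
    """Parse JSON back into the model, rebuilding each entry."""
    raw = json.loads(text)
    entries = [Entry(**item) for item in raw.get("entries", [])]
    return ModuleDoc(module=raw["module"],
                     description=raw.get("description", ""),
                     entries=entries)


doc = ModuleDoc(
    module="demo",
    description="Example module.",
    entries=[Entry(name="add", kind="proc",
                   code="proc add(a, b: int): int",
                   description="Adds two numbers.", line=3)],
)
print(to_json(doc))
```

Writing the model down as types makes it easier to review which pieces of information the converter tools can rely on, whatever the final field names turn out to be.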

PhilippMDoerner commented 9 months ago

I guess my own main problem is that I have notorious difficulty seeing what is missing from the jsondocs. Some things I can see (e.g. it doesn't appear to split out doc comments correctly); others I'm likely not aware of, because my knowledge of this problem domain is very close to zero.

So I'm having trouble identifying what "should" be there and is missing.

PMunch commented 9 months ago

Yeah, this is another reason why I'd want to convert the HTML docs to JSON instead of simply improving the current jsondoc. Basically, create the JSON output and the JSON->HTML converter program in parallel by moving things out piece by piece. The biggest problem is dealing with things like code highlighting and the RST/Markdown hybrid Nim uses. I'm guessing the current output of this is HTML, so we'd either need to include it raw, or write a JSON output format for it instead of the HTML one.
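The two options for rendered doc comments can be sketched as follows, again in Python and under stated assumptions: a description is either a string of pre-rendered HTML (included raw) or a list of structured nodes. The node layout (kind, text, lang) is invented here for illustration and is not an existing Nim format.

```python
import html


def render_description(description) -> str:
    """Turn a description into HTML, accepting either representation."""
    if isinstance(description, str):
        # Option 1: the compiler already rendered HTML, so pass it through.
        return description
    # Option 2: structured nodes that any output tool can interpret.
    out = []
    for node in description:
        kind = node["kind"]
        text = html.escape(node["text"])
        if kind == "text":
            out.append(text)
        elif kind == "emphasis":
            out.append(f"<em>{text}</em>")
        elif kind == "code":
            lang = html.escape(node.get("lang", ""), quote=True)
            out.append(f'<pre class="{lang}">{text}</pre>')
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
    return "".join(out)


raw = "<p>Adds <em>two</em> numbers.</p>"
structured = [
    {"kind": "text", "text": "Adds "},
    {"kind": "emphasis", "text": "two"},
    {"kind": "text", "text": " numbers."},
]
print(render_description(raw))
print(render_description(structured))
```

The raw form is the cheaper migration step but ties the JSON to HTML consumers; the structured form costs more up front and would let a LaTeX tool render the same nodes.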