Rules to run rustdoc on multiple crates

UebelAndre commented 1 year ago

I would like to take multiple crates and build docs for all of them in a similar fashion to Cargo.

With Cargo, I'm able to take the following workspace with three packages (foo, bar, and baz), where foo depends on bar and baz is an isolated library, and generate all docs in one place.

rustdoc_workspace.zip

├── Cargo.lock
├── Cargo.toml
├── bar
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── baz
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
└── foo
    ├── Cargo.toml
    └── src
        ├── lib.rs
        └── main.rs

6 directories, 9 files
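For reference, the Cargo side of this can be sketched as below. The manifest and source contents are assumptions made for illustration (the attached zip is not reproduced here); only the layout and the foo -> bar dependency come from the description above.

```shell
#!/bin/sh
# Recreate the layout above in a scratch directory.
# All file contents are hypothetical; only the layout is from the issue.
ws="$(mktemp -d)"
for crate in foo bar baz; do
  mkdir -p "$ws/$crate/src"
done

cat > "$ws/Cargo.toml" <<'EOF'
[workspace]
members = ["foo", "bar", "baz"]
resolver = "2"
EOF

# bar and baz are plain libraries with no dependencies.
for crate in bar baz; do
  cat > "$ws/$crate/Cargo.toml" <<EOF
[package]
name = "$crate"
version = "0.1.0"
edition = "2021"
EOF
  printf '//! The %s crate.\npub fn hello() {}\n' "$crate" > "$ws/$crate/src/lib.rs"
done

# foo has a library and a binary, and depends on bar.
cat > "$ws/foo/Cargo.toml" <<'EOF'
[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[dependencies]
bar = { path = "../bar" }
EOF
printf '//! The foo crate.\npub fn foo() { bar::hello() }\n' > "$ws/foo/src/lib.rs"
printf 'fn main() { foo::foo() }\n' > "$ws/foo/src/main.rs"

# One invocation documents every workspace member into target/doc.
if command -v cargo >/dev/null 2>&1; then
  (cd "$ws" && cargo doc --workspace --no-deps)
else
  echo "cargo not found; would run: cargo doc --workspace --no-deps"
fi
```

The point of interest is the last step: a single `cargo doc --workspace` run covers foo, bar, and baz even though baz is not in foo's dependency graph.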

scentini commented 1 year ago

@P1n3appl3 I believe Fuchsia has interest in using rustdoc. Could you share some context here on whether what @UebelAndre is proposing works for you, or any other constraints that we might want to take into account when working on improving the documentation story?

P1n3appl3 commented 1 year ago

Yep, we were also interested in a way to produce docs for the entire workspace rather than separate per-crate docs, which aren't very useful because links wouldn't work and you don't get an index.

Off the top of my head here are some things to take into account:

We'd expect rust_library targets to be included in the workspace-wide docs by default without an explicit rust_doc(...) in their BUILD file, though there should be some way for crates to opt out (and maybe testonly/binary crates should be opted out by default?). As in @UebelAndre's example, the crates shouldn't all have to be part of the same transitive set of dependencies. I'd assume that you could list multiple top-level crates, which would be included with all of their transitive deps, or maybe there's some way of discovering them in the workspace without that?
We need to be able to control the platform/target triple used to generate docs. It's unfortunate that the docs.rs dropdown menu for switching platforms isn't part of rustdoc itself, but I think emitting separate docs per platform is enough for us to host separate aarch64 and x86_64 docs for example.
--document-private-items should probably be configurable, though I don't think it's necessary on a per-crate basis.
This isn't super related, but it might make sense to use whatever mechanism is collecting the crates for a monolithic rustdoc run to also run doctests from those crates. Currently it's pretty high friction to write doc tests because you need to remember to write a rust_doc_test(...) definition in the BUILD file, but if generating workspace wide docs also ran doctests that'd be an easy way of picking them up automatically.
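The platform and private-items points above can be illustrated in Cargo terms with a small dry-run sketch. The target triples and the `DOC_PRIVATE` switch are assumptions chosen for illustration; how rules_rust would expose the equivalent settings is left open by this thread.

```shell
#!/bin/sh
# Print (dry-run) one `cargo doc` command per target triple.
# DOC_PRIVATE is a hypothetical switch for this sketch, not a Cargo variable.
doc_commands() {
  private_flag=""
  if [ "${DOC_PRIVATE:-0}" = "1" ]; then
    private_flag=" --document-private-items"
  fi
  for triple in "$@"; do
    # With --target, Cargo writes docs under target/<triple>/doc,
    # so each platform ends up with its own separate doc tree.
    echo "cargo doc --workspace --target ${triple}${private_flag}"
  done
}

doc_commands x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu
```

Here the private-items setting applies to the whole run rather than per crate, matching the suggestion that per-crate control is probably unnecessary.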

UebelAndre commented 1 year ago

https://github.com/bazelbuild/rules_rust/issues/152 could be related

konkers commented 1 year ago

An incremental improvement here could be to extend rust_doc to allow a list of crates to be provided. Transitive dependency crawling could be added later.

konkers commented 1 year ago

Here's a prototype of generating docs for multiple crates at once: https://github.com/konkers/rules_rust/tree/wip/rustdoc.

Since rustdoc must be invoked once per crate but with the same output directory, these invocations need to be wrapped into a single Bazel action; otherwise Bazel will complain about multiple rules having the same output but different inputs. This is accomplished through two scripts:

dump_args.sh: Creates an executable shell script that runs the executable and arguments passed in through its args.
run_all.sh: Takes a list of executables and runs them in order.

dump_args.sh is run for every crate instead of the wrapped rustdoc. After all crates have been run through dump_args.sh, run_all.sh is run to invoke all the scripts created by dump_args.sh.
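A minimal sketch of that two-step idea follows. It is not the code from the prototype branch: the function bodies, the `fake_rustdoc` stand-in, and the `crates.txt` file are all assumptions made so the example runs without a Rust toolchain.

```shell
#!/bin/sh
work="$(mktemp -d)"

# dump_args <script> <cmd> [args...]: instead of running <cmd> now, write an
# executable script that will run it later with exactly the same arguments.
dump_args() {
  script="$1"; shift
  {
    echo '#!/bin/sh'
    printf 'exec'
    for arg in "$@"; do
      # Single-quote each argument, escaping any embedded single quotes.
      printf " '%s'" "$(printf '%s' "$arg" | sed "s/'/'\\\\''/g")"
    done
    echo
  } > "$script"
  chmod +x "$script"
}

# run_all <script>...: run the recorded scripts in order, stopping on failure.
run_all() {
  for s in "$@"; do
    "$s" || return 1
  done
}

# Hypothetical stand-in for rustdoc: writes a per-crate directory and appends
# to a shared file, mimicking files that rustdoc updates in place.
cat > "$work/fake_rustdoc" <<'EOF'
#!/bin/sh
mkdir -p "$2/$1"
echo "$1" >> "$2/crates.txt"
EOF
chmod +x "$work/fake_rustdoc"

# Step 1: record one command line per crate (nothing is documented yet).
for crate in bar baz foo; do
  dump_args "$work/doc_$crate.sh" "$work/fake_rustdoc" "$crate" "$work/doc"
done

# Step 2: a single step replays them all against the shared output directory.
run_all "$work/doc_bar.sh" "$work/doc_baz.sh" "$work/doc_foo.sh"
cat "$work/doc/crates.txt"
```

Only the second step writes to the shared directory, which is what lets it be modelled as one action with one set of outputs.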

Thoughts on this approach? Specifically, how do people feel about dumping the command lines and wrapping them in a single invocation later?

Additionally, what are people's thoughts on how to expose this functionality? Two commands (one for a single crate and one for a set of crates)? Or somehow combine them into a single rust_doc(), perhaps with mutually exclusive crate and crates args?

scentini commented 1 year ago

If I read the code correctly, this would require listing every rust_library in rust_doc.crates. I think it might be better to investigate using an aspect to generate the docs and then to copy all the generated docs into a single directory in the rust_doc implementation. This would make it easier to document everything as it is only the final targets that need to be put into rust_doc.crates.

konkers commented 1 year ago

Perhaps I'm misunderstanding some details of aspects. I understand how aspects could be used to essentially generate rules for running rustdoc on each crate. These rules could then be run through a single Bazel command-line invocation using a target specifier like //...:all. I don't understand how you would then have a rule that depends on all those outputs without manually specifying them in your build files.

A larger problem, however, is that rustdoc expects to be invoked for each crate with the same output directory. There are files that it modifies in place (https://rustc-dev-guide.rust-lang.org/rustdoc.html#multiple-runs-same-output-directory). Bazel will, rightfully, not let you declare multiple targets with the same output and different inputs.

I could see an aspect approach being useful for documenting transitive dependencies of crates.

Am I missing something here?

matte1 commented 2 months ago

I'm also interested in this feature. Has any progress been made here? Happy to try to contribute if someone has a stepping off point.

konkers commented 2 months ago

> I'm also interested in this feature. Has any progress been made here? Happy to try to contribute if someone has a stepping off point.

I spoke with scentini offline a while ago and they pointed out that my approach of bundling all the crate documentation into a single action has scalability issues with large source bases and distributed builds. I think the way forward here is for someone to engage with the rustdoc authors and propose and implement a mode where each crate can be documented separately then stitched together (cross referencing, search index creation, etc.) in one final, hopefully lightweight, step.

matte1 commented 2 months ago

I just rebased and played with this and it seems to work great.

I understand that this doesn't scale well and that it should be implemented differently in the long term, but it's also completely opt-in, and it seems like it would give people the ability to bundle crate docs together in a CI step that can afford to be a little more expensive, or in a post-merge-to-main action that updates GitHub Pages.

EtomicBomb commented 2 weeks ago

> propose and implement a mode where each crate can be documented separately then stitched together (cross referencing, search index creation, etc.) in one final, hopefully lightweight, step.

Hello! I'm from the Fuchsia team, and I'm looking at this from the rustdoc perspective! I'm working on a rustdoc RFC, and I would be very happy if you could take a look and see if these additions to rustdoc would meet your needs!

The goal is to allow users to invoke rustdoc on crates in separate out directories, and merge the cross-crate information that rustdoc generates.

https://github.com/rust-lang/rfcs/pull/3662

bazelbuild / rules_rust

Rules to run rustdoc on multiple crates #1837