Closed jyn514 closed 1 year ago
I'm not sure what do you mean by freezing the resolver outputs, but I meant the regular rustc's mode of operation
Rc
to_resolver_outputs
always results in calling Resolver::into_outputs
rather than Resolver::clone_outputs
Resolver::into_outputs
consumes the resolver with ResolverOutputs
being the only thing left from it, i.e. it's a "frozen" summary of the resolver's work at a certain pointSo we need to achieve "freezing" the resolver outputs in rustdoc rather than to avoid it.
Regarding eddyb's comment.
Yes, rustdoc resolves things after creating tcx (which needs ResolverOutputs
to be created), so we have to keep the resolver.
This needs to be avoided by moving all name resolution activity in rustdoc to an earlier point.
Would it be possible for lower_to_hir to stop stealing the resolver? Then rustdoc wouldn't need to clone it in the first place, it could just call queries.expansion().peek().1 whenever it needs access to the resolver.
Not sure about the details of rustc_interface's work, but yes, resolver should be unique and should not be cloned.
So we need to achieve "freezing" the resolver outputs in rustdoc rather than to avoid it.
Ok, I see, I updated the title.
Yes, rustdoc resolves things after creating tcx (which needs ResolverOutputs to be created), so we have to keep the resolver. This needs to be avoided by moving all name resolution activity in rustdoc to an earlier point.
I'm not quite sure I follow - is there a reason the tcx has to freeze the resolver into ResolverOutputs? Or could the tcx store the resolver itself? If it stored the resolver rustdoc could stop cloning it without major changes.
To be clear, I don't yet know how much work it would be for rustdoc to move all resolution before creating a TyCtxt, but the model where TyCtxt stores a resolver makes more sense in my head; or at least, it's more flexible.
Or could the tcx store the resolver itself?
It may be possible.
Resolver contains a lot of early compilation stuff like NodeId
s that is out of place in tcx though.
Also it may be preferable to finish all the resolution activities as early as possible (at least those that may result in loading new crates).
I think we may end up with some scheme working similarly to this if incremental compilation and queries advance so much that we'll start skipping resolution of paths if they are not actually used, but that would probably be a very different resolver.
For now, I'd prefer to normalize to what rustc does and have a single scheme with resolver working early instead. Paths in doc comments are not fundamentally different from paths in e.g. expressions, and if we visit and resolve paths early in general, then paths in doc comments can be visited and resolved early as well. (If we migrate to something like "resolver in tcx" in the future, then rustc and rustdoc will also migrate to it together.)
Remove clean::Attributes::links
Done in https://github.com/rust-lang/rust/pull/83833 (although I need to clean it up before it can be merged).
Expand IntraDocCrateLoader (from #83738) to fill out that field by handling more things that are currently in collect_intra_doc_links. I don't think it will be able to handle anything more than resolve_path:
This isn't currently possible because resolving Self
goes through the TyCtxt to get the name instead of the resolver: https://github.com/rust-lang/rust/blob/2616ab1c57e2d69f989307389b27ee996ed82575/src/librustdoc/passes/collect_intra_doc_links.rs#L868-L885
Here is a fix (it uses a tcx currently, but the same API is available on CStore): https://github.com/rust-lang/rust/commit/5d5be028b318ed7e542bf238ebf2fc009a830a5a
Oh wait no I'm dumb, https://github.com/rust-lang/rust/commit/5d5be028b318ed7e542bf238ebf2fc009a830a5a only helps if the item isn't an impl block. @petrochenkov Is there a way to resolve the Self
type of an impl block? I tried resolver.resolve_str_path_error(DUMMY_SP, "Self", TypeNS, self_id)
, but that panics because it's not a module.
If not it's not a giant deal, I'll just have to figure out how to fix the FIXME on line 868.
Ugh, I have to figure out how to replicate this bit somehow: https://github.com/rust-lang/rust/blob/2616ab1c57e2d69f989307389b27ee996ed82575/src/librustdoc/passes/collect_intra_doc_links.rs#L1824-L1833 Maybe the early pass could do that? I think the late pass for type-relative paths will need to have a partial resolution anyway, so it shouldn't be too much more work.
Is there a way to resolve the
Self
type of an impl block?
rustc
keeps the "current impl" (or other item providing Self
) in the late resolution visitor walking AST (see fn with_self_rib_ns
).
The analogous visitor in rustdoc is IntraLinkCrateLoader
right now.
I'm not sure how rustdoc resolves Self
now, but it cannot do it through Resolver
because resolver doesn't provide any methods able to resolve it. So it's probably fine for rustdoc to just continue doing what it's doing now.
rustc keeps the "current impl" (or other item providing Self) in the late resolution visitor walking AST (see fn with_self_rib_ns).
Thanks, this is helpful!
I'm not sure how rustdoc resolves Self now, but it cannot do it through Resolver because resolver doesn't provide any methods able to resolve it. So it's probably fine for rustdoc to just continue doing what it's doing now.
Hmm, ok. It can't do precisely the same thing because it uses the tyctxt to look up the type - maybe it could look up the DefKind while visiting the parent item? I have a few different ideas that could work (but I will probably end up just going with "use the DefId directly instead of stringifying" since I think I'll need it anyway).
Would it be possible for lower_to_hir to stop stealing the resolver?
Note that in practice this requires resolution to be incremental, since otherwise the tyctxt will still be using the frozen ResolverOutputs
which has the same issues. See also https://github.com/rust-lang/compiler-team/issues/437.
https://github.com/rust-lang/rust/pull/88679 now collects traits in scope for doc links, which rustdoc is later using in its attempts to emulate type-relative path resolution.
The only remaining thing now is to use the early doc link resolution (src/librustdoc/passes/collect_intra_doc_links/early.rs
) to build a map (HashMap<(/*path*/ String, Namespace, /*parent module*/DefId), Res>
or similar) that will be used by all later rustdoc passes to retrieve resolutions instead of calling into resolver using resolve_str_path_error
.
The branch in progress fixing this issue is https://github.com/petrochenkov/rust/tree/doclink2.
The functional part is mostly done, one major missing part is emulating the Self
replacement in paths that is done by rustdoc, some other remaining special cases like this may also remain.
I'll start benchmarking after the functional changes are ready.
Self
is a difficult case due to the way in which rustdoc deals with it currently.
Another branch in progress (https://github.com/petrochenkov/rust/tree/doclinkself) attempts to avoid the textual replacement of Self::
that is currently done by rustdoc with resolving Self
to a type at type-checking time, without involving Resolver
at all.
So it's a pre-requisite for the doclink2 branch.
Rustdoc currently creates a copy of the resolver to use for intra-doc links: https://github.com/rust-lang/rust/blob/d474075a8f28ae9a410e95d849d009006db4b176/src/librustdoc/core.rs#L350-L354 This is a Terrible, Horrible, No Good, Very Bad Idea. In particular, it causes rustdoc's copy of the resolver and the TyCtxt to disagree about what crates exist:
It's also distorting rustc_resolve, since not all of the outputs make sense to clone in the first place: https://github.com/rust-lang/rust/pull/65625#discussion_r336780545
We should refactor rustdoc somehow to allow getting rid of
Resolver::clone_outputs
. @petrochenkov suggests moving anything that needs to touch the resolver before HIR lowering: https://github.com/rust-lang/rust/issues/68427#issuecomment-578493425.@petrochenkov what do you think about @eddyb's comment in https://github.com/rust-lang/rust/pull/65625#discussion_r336781320 ?
Would it be possible for
lower_to_hir
to stop stealing the resolver? https://github.com/rust-lang/rust/blob/d474075a8f28ae9a410e95d849d009006db4b176/compiler/rustc_interface/src/queries.rs#L227Then rustdoc wouldn't need to clone it in the first place, it could just call
queries.expansion().peek().1
whenever it needs access to the resolver.See https://github.com/rust-lang/rust/issues/68427 for previous discussion.
Implementation History
clean::Item
s before resolving intra-doc links: https://github.com/rust-lang/rust/pull/83833collect_intra_doc_links
: https://github.com/rust-lang/rust/pull/84101preprocess_link
to also take into account the hacks forSelf::
andcrate
: https://github.com/rust-lang/rust/blob/ef524710669a003cf2b24361dc6b04b566e560fa/src/librustdoc/passes/collect_intra_doc_links.rs#L880, https://github.com/rust-lang/rust/blob/ef524710669a003cf2b24361dc6b04b566e560fa/src/librustdoc/passes/collect_intra_doc_links.rs#L1124-L1153