Closed aturon closed 4 years ago
If we want to support adding
#[track_caller]
to trait methods,
This refers to impl
of a trait
, not the declarations within the trait
, right?
IMO the easiest way to implement this involves no procedural macros or MIR passes. We'd just pass extra arguments on direct calls, and require a shim to get function pointers of the function. That would also surely work with trait methods, since it'd be post-monomorphization.
To explain, the shim (which we may already have some variation of) would "just" do the direct call to the function, which would pass the location of the shim, the same as the original function.
This refers to impl of a trait, not the declarations within the trait, right?
I can't remember. =) But I suspect so. I don't think there's anything special about "trait methods" per se -- it's really about dynamic dispatch.
@kennytm had a prototype implementation:
https://github.com/kennytm/rust/tree/caller-info-4
Maybe @kennytm you can summarize the approach you took? Do you think you'll have time to rebase etc? I'd like ideally to get @eddyb to buy in to the overall approach. =)
The prototype implementation works like this (note that #[track_caller]
was called #[implicit_caller_location]
there):
First there will be an AST transformation pass (implemented as an attribute proc-macro in src/libsyntax_ext/implicit_caller_location.rs
) which expands:
#[track_caller]
#[...attributes...]
fn foo(...) -> T {
/* code... */
}
into
#[rustc_track_caller]
#[inline]
#[...attributes...]
fn foo(...) -> T {
let __closure = |__location| { /* code... */ };
FnOnce::call_once(__closure, intrinsics::caller_location())
}
The purpose of this pass is to conveniently make up a new DefId
.
Create a new MIR inlining pass (main implementation in src/librustc_mir/transform/inline.rs
). This pass does two things:
foo()
where the attributes of the callee foo
contains #[rustc_track_caller]
.If intrinsics::caller_location()
is called :-
__closure
, replace it by the argument __location
. This is to propagate the caller location.Define the caller_location()
intrinsic.
fn caller_location() -> core::panic::Location<'static>;
Because Location
now needs to be known by the compiler, make the Location struct a lang-item.
Update the standard library to use track-caller features:
panic!
to use caller_location()
(or its safe counterpart, Location::caller()
)#[track_caller]
to unwrap()
and expect()
etc.Implement the -Z location-detail
flag (search for location_detail
) so that the filename/line/column can be redacted.
I think a syntactical transformations is unnecessary because we can instead change the "direct call ABI" (vs reifying to a fn pointer, including trait vtables via miri, which could go through a MIR shim). MIR inlining should also not be needed.
@eddyb So you are suggesting to change the extern "Rust"
ABI to always pass an extra argument?
@kennytm Only for functions declared with the attribute and called through static dispatch, everything else would use a shim when reifying (which I think we do already in some other cases).
@eddyb Maybe you could write down some mentoring instructions i.e. which files to look at 😊
@eddyb do it! do it! I want this feature.
@eddyb How are the mentoring instructions coming along?
Oh, I don't recall exactly what happened here.
For making static calls do something different, src/librustc_codegen_ssa/mir/block.rs
is the place you need to change, while for shims, src/librustc_mir/shim.rs
is where their MIR is computed.
But the rest of the pieces, I don't know off the top of my head where they happen.
fn
pointers and virtual calls.We could start by disallowing reification/virtualization of such functions, and only implement the static dispatch ABI changes.
We could start by disallowing reification/virtualization of such functions, and only implement the static dispatch ABI changes.
I don't expect this attribute to be usable on "real" functions without us implementing reification.
But for implementing this, I think one good strategy would be:
For now, I don't think there is sufficient reason to support implicit caller location on trait methods, and supporting that is fairly complicated. So as @aturon said on the head PR, just don't do it.
You should make sure that the #[blame_caller]
attribute "works". This means that:
#[blame_caller(42)]
),#[naked]
or extern "ABI"
together with #[rustc_implicit_caller_location]
should raise an error.I'm not particularly sure what's the exact way to do this, but you can maybe find some PR that implemented an attribute for that, or ask on Discord.
Make sure to add tests.
At this stage, I won't even add a parameter, just have the call-sites go through different paths.
So the virtual-call-only shim at #54183 is a good model for how things need to be done. You'll need to add a ReifyShim
that is similar to VirtualShim
in behavior (see that PR). You'll probably need to split ty::Instance::resolve
to ty::Instance::resolve_direct
and ty::Instance::resolve_indirect
(in addition to the already-existing ty::Instance::resolve_vtable
) and adjust the call-sites. Have resolve_indirect
return the shim if it is needed (i.e., when calling a #[blame_caller]
function).
#[blame_caller]
MIR construction is in rustc_mir::build. There's already code there handling implicit arguments, which you could work with.
You'll need to fix ty::Instance::fn_sig
to also contain the new type.
Change the shim you added in Step 1 to pass an "empty" location parameter - some "dummy" that represents a call that uses a shim. See the other shims for how to generate things in shims. Ask @eddyb if you are fighting const eval.
Find all the calls to ty::Instance::resolve_direct
you added previously, and make them pass the "correct" location parameter to the function. These callsites can poobably already modify parameters because of InstanceDef::Virtual
, you should be able to just follow that.
Make sure that when #[track_caller]
calls are nested, you pass the location of your call-site, rather than your location. Add a test for that. Also add a test that when one of the calls in the middle is not #[track_caller]
, the call-site is not passed.
You should probably add a function on ty::Instance
that tells you whether you need to add the caller location.
intrinsic::caller_location
See some other intrinsic for how to type-check it. Make sure it can't be called from functions that aren't #[trace_caller]
!
I think it might even be the best to do the wiring in MIR construction, like the move_val_init
intrinsic - that way you wouldn't even have to convince MIR optimizations not to dead-code-eliminate your new parameter or something evil like that, and to make it not be affected by "plain" MIR inlining you'll only have to change the ty::Instance::resolve_direct
callsite there (you did make MIR inlining use resolve_direct
, did you?).
Now you only need to implement the library support.
Is anyone working on this?
If nobody is working on this I would like to try to implement it.
Do you think that it is feasible as a first contribution to the compiler?
Are the steps in https://github.com/rust-lang/rust/issues/47809#issuecomment-443538059 still valid?
Yes, those steps still sound reasonable. This would be a difficult first contribution, but if you ask questions on https://rust-lang.zulipchat.com/# and ping @eddyb and @arielb1 I'm sure you'd be able to start making progress, and there are likely others who could help out.
I'm sorry for the delay. I couldn't have time to work on this until now =\
I have completed the step 0 described in https://github.com/rust-lang/rust/issues/47809#issuecomment-443538059 . The progress is available in my branch, though I guess that I will have to wait until the pull-request is open to know is everything is OK.
About error codes: if I understand correctly, every error emitted by the compiler has its unique code, so I have to register a new E0###
for every error. I assume that the final code is assigned only when patch is ready to be merged into master, so I'm using dummy codes (E0900
, E0901
, ...) for the new errors.
I'm interested in helping with this, are you still working on it @ayosec? If so, is there any way for me to contribute?
are you still working on it @ayosec?
Yes. I'm sorry for the lack of updates. During August I had almost no time, and I'm trying to start to continue now. At this moment I'm working with the shim to add the location parameter.
If so, is there any way for me to contribute?
Right now I spend most of the time reading and understanding how everything works, so there is not so much code to write.
Something that would be very helpful is writing the documentation for the new errors (like this or this, at the moment), since I'm not very fluent in English.
Ping @ayosec do you have time to address this or is it OK with you if someone else takes over?
I'm still working on it very slowly, but feel free to take it if you want. I can try to implement another issue once I get more time.
I'm eager to pick this up, opened a PR starting from your branch.
@Centril recommended splitting this into a few PRs:
add a location parameter to MIR of
#[track_caller]
inrustc_mir::build
(reference existing implicit arguments work if needed)
A good example of this that landed recently is the VaList
you get for something like this:
extern "C" fn foo(x: T, args: ...) {}
There it's c_variadic
in the signature, instead of an attribute, but it's similar in that HIR and MIR bodies have one more input than the signature and a few places have to account for it (see #63492).
(replacing this todo list with a db paper link to the implementation notes I'm keeping: https://paper.dropbox.com/doc/rfc-2091-impl-notes--Am~iTmS_9UmQVUK63kwMYAQDAQ-rwCdRc2fi2yvRZ7syGZ9q)
If I understand stability in std correctly, we can't migrate panic! to use Location::caller() until the latter has stabilized. Is this correct?
No, it's possible to use unstable functionality from inside std (e.g. inside the panic!
macro).
Is it different for macro expansions? I would naively assume so.
No-- let's follow up on this zulip thread.
Awesome, thanks for clarifying! I edited the library support todo list appropriately.
I have a PR ready for review implementing core::intrinsics::caller_location
and preparing the panicking infrastructure to use its return value. This lays the groundwork for propagating caller location as a single pointer-sized implicit variable through #[track_caller]
functions, which is up next.
I'm using this paper doc for tracking.
The PR I linked above has been approved and is towards the top of the homu queue (fingers crossed!). I've opened a PR following that up with the start of a real implementation for the #[track_caller]
attribute. Right now it only includes a functional const implementation and failing tests for a not-yet-extant codegen implementation.
Awesome work.
Could we also add a method like Location::current()
that always returns the location of the code that called Location::current()
, even if it happens to be within a #[track_caller]
function?
That way, I think e.g. line!()
could be replaced with a real macro that expands to Location::current().line()
.
Admittedly, although it sounds straightforward, it seems like it might be awkward to implement, given the way we wrap intrinsics in wrapper functions.
It's probably far too heavy for line!()
, but I think (|| Location::caller())()
would work even in a #[track_caller]
function.
The PR implementing the attribute is on its way to land. I'm working on a follow-up to lay the groundwork for annotating functions in std, hopefully I can land that before the next beta (and thus bootstrap) branch cut. It would be nice to be able to complete some of the std additions during the next release cycle.
Update: the PR I linked above has landed! I'll pull tonight's nightly in the morning and try it out on my out of tree test cases :D.
I have another PR open which I think will be the last one before we can start annotating the standard library. Ideally we can get it in before the next beta branch cut, as that will also allow us to un-cfg all of the existing non-bootstrap-compatible code in core and std.
I don't think we need to get anything into beta other than the compiler changes (which all landed other than that miri change you mentioned in a comment).
That is, once beta is promoted, anything that is cfg(not(bootstrap))
on beta can/needs to become the only variant on master, because cfg(bootstrap)
on master is a fully bootstrapped and released beta.
Nominating for lang team meeting to update about the awesome progress here, that's all.
The PR to have panic!
use the new intrinsic always has landed! I'm working on a branch now to annotate these functions in std:
The RFC also listed several implementations of Index
and IndexMut
, but it wasn't approved because of implementation limitations:
This RFC only advocates adding the #[track_caller] attribute to the unwrap and expect functions. The index and index_mut functions should also have it if possible, but this is currently postponed as it is not investigated yet how to insert the transformation after monomorphization.
Thing is, the current implementation is significantly different from the approved RFC's proposal and actually is compatible with trait methods to my knowledge (it is after monomorphization IIUC).
I'm interested in adding the attribute to the indexing functions, should I submit an "amended RFC" describing the new implementation and seek consensus on applying the attribute to trait methods? I've already thought about doing that since the implementation has deviated so much from the original proposal.
Have that PR open with the panic messages working!!!
About the code bloat drawback in the RFC, would setting RUST_BACKTRACE=1
make it
unnoticeable?
About the code bloat drawback in the RFC, would setting
RUST_BACKTRACE=1
make it unnoticeable?
I don't think so, because RUST_BACKTRACE
has effect at runtime, not compile time IIUC.
Alright, https://github.com/rust-lang/rust/pull/67887 has landed! Going to manually verify everything is working with tomorrow's nightly.
@nikomatsakis what do you think should be the process to form a plan for #[track_caller]
in trait impls? Mentioned above. I believe it's the only remaining significant question that would block stabilization after a couple of release cycles.
@rustbot modify labels to +I-Nominated
Nominating for lang-team discussion on @anp's final question. Does anyone object to extending #[track_caller]
to work in trait impls. Apparently the new discussion strategy (which I am still investigating) can handle this just fine.
My opinion: we should definitely do it! But I would like to see the new impl strategy written up for the rustc-guide, as I'd like to know how it works. (I'm not sure how much of that belongs in the reference, too? At least some of it.)
Re: a write-up, I planned to write about the attribute for the reference and to write some info about the implicit arguments for the rustc-guide. I had considered these important for stabilization and am going to do them regardless. If that's all that's required to finish the last bit of proposed work from the RFC, I'm excited!
(edit: the book -> rustc-guide)
@anp
write some info about the implicit arguments for the book.
what is "the book" in that sentence?
One thing I would really encourage is that you document the high-level picture of the impl for rustc-guide.
We discussed this in our lang team meeting today. We approve of extending this to trait items. This was based on an understanding that it falls out fairly natural from the implementation -- I gave a summary of what I understand based on skimming the PRs plus earlier conversations with @eddyb.
Ah yes, i meant rustc-guide when I said ‘the book’ - I’ve been referring to it much more than TRPL and forgot myself 🤣.
This example has regressed (including on stable by now): (playground)
fn main() {
std::collections::VecDeque::<String>::with_capacity(!0);
}
The location reported is now <::core::macros::panic macros>:3:10
.
(the choice of VecDeque::with_capacity
is arbitrary, it just happens to have an assert!
inside, and it will be codegen'd in the user crate because it's generic)
In general, we don't track enough information to recover the invocation site, when we have to codegen the MIR cross-crate. @Aaron1011 made me realize this is a problem in https://github.com/rust-lang/rust/pull/68965#issuecomment-583781140.
A cheap hack we could do is replace the macro span with the invocation one, when serializing Span
s inside MIR (or Span
s in general?).
cc @petrochenkov
Status update:
I'll be keeping those PRs warm and adding to the rustc-guide section for now. Modulo new bugs, I'm hoping that we'll be able to let this bake for a couple of releases and then the next active task will be a stabilization report. Am I missing anything?
the bug @eddyb mentioned in https://github.com/rust-lang/rust/issues/47809#issuecomment-583784847 has a proposed fix in #68965
No, it was brought up on #68965, but the fix is to track hygiene info cross-crate, which last I heard @Aaron1011 was working on.
I have some code locally which serializes hygiene information, but it's currently blocked on https://github.com/rust-lang/rust/pull/68941
https://github.com/rust-lang/rust/issues/47809#issuecomment-583784847
[triagebot] I don't know anything about the treatment of spans in MIR inlining.
This is a tracking issue for the RFC "Implicit caller location" (rust-lang/rfcs#2091).
Steps:
Unresolved questions:
~~If we want to support adding
#[track_caller]
to trait methods, the redirection pass/query/whatever should be placed after monomorphization, not before. Currently the RFC simply prohibits applying#[track_caller]
to trait methods as a future-proofing measure.~~Diverging functions should be supported.~~The closure
foo::{{closure}}
should inherit most attributes applied to the functionfoo
, in particular#[inline]
,#[cold]
,#[naked]
and also the ABI. Currently a procedural macro won't see any of these, nor would there be anyway to apply these attributes to a closure. Therefore,#[rustc_implicit_caller_location]
currently will reject#[naked]
and ABI, and leaving#[inline]
and#[cold]
mean no-op. There is no semantic reason why these cannot be used though.~~