rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.97k stars 12.68k forks source link

Tracking issue for RFC 2091: Implicit caller location #47809

Closed aturon closed 4 years ago

aturon commented 6 years ago

This is a tracking issue for the RFC "Implicit caller location" (rust-lang/rfcs#2091).

Steps:

Unresolved questions:

eddyb commented 6 years ago

If we want to support adding #[track_caller] to trait methods,

This refers to impl of a trait, not the declarations within the trait, right?

IMO the easiest way to implement this involves no procedural macros or MIR passes. We'd just pass extra arguments on direct calls, and require a shim to get function pointers of the function. That would also surely work with trait methods, since it'd be post-monomorphization.

To explain, the shim (which we may already have some variation of) would "just" do the direct call to the function, which would pass the location of the shim, the same as the original function.

nikomatsakis commented 6 years ago

This refers to impl of a trait, not the declarations within the trait, right?

I can't remember. =) But I suspect so. I don't think there's anything special about "trait methods" per se -- it's really about dynamic dispatch.

nikomatsakis commented 6 years ago

@kennytm had a prototype implementation:

https://github.com/kennytm/rust/tree/caller-info-4

Maybe @kennytm you can summarize the approach you took? Do you think you'll have time to rebase etc? I'd like ideally to get @eddyb to buy in to the overall approach. =)

kennytm commented 6 years ago

The prototype implementation works like this (note that #[track_caller] was called #[implicit_caller_location] there):

  1. First there will be an AST transformation pass (implemented as an attribute proc-macro in src/libsyntax_ext/implicit_caller_location.rs) which expands:

    #[track_caller]
    #[...attributes...]
    fn foo(...) -> T {
        /* code... */
    }

    into

    #[rustc_track_caller]
    #[inline]
    #[...attributes...]
    fn foo(...) -> T {
        let __closure = |__location| { /* code... */ };
        FnOnce::call_once(__closure, intrinsics::caller_location())
    }

    The purpose of this pass is to conveniently make up a new DefId.

  2. Create a new MIR inlining pass (main implementation in src/librustc_mir/transform/inline.rs). This pass does two things:

    1. Force-inline any functions calls foo() where the attributes of the callee foo contains #[rustc_track_caller].
    2. If intrinsics::caller_location() is called :-

      • If it is used from the __closure, replace it by the argument __location. This is to propagate the caller location.
      • Otherwise, replace the call with the caller's location span.
  3. Define the caller_location() intrinsic.

    fn caller_location() -> core::panic::Location<'static>;
  4. Because Location now needs to be known by the compiler, make the Location struct a lang-item.

  5. Update the standard library to use track-caller features:

    1. Change panic! to use caller_location() (or its safe counterpart, Location::caller())
    2. Add #[track_caller] to unwrap() and expect() etc.
  6. Implement the -Z location-detail flag (search for location_detail) so that the filename/line/column can be redacted.

eddyb commented 6 years ago

I think a syntactical transformations is unnecessary because we can instead change the "direct call ABI" (vs reifying to a fn pointer, including trait vtables via miri, which could go through a MIR shim). MIR inlining should also not be needed.

kennytm commented 6 years ago

@eddyb So you are suggesting to change the extern "Rust" ABI to always pass an extra argument?

eddyb commented 6 years ago

@kennytm Only for functions declared with the attribute and called through static dispatch, everything else would use a shim when reifying (which I think we do already in some other cases).

kennytm commented 6 years ago

@eddyb Maybe you could write down some mentoring instructions i.e. which files to look at 😊

nikomatsakis commented 6 years ago

@eddyb do it! do it! I want this feature.

Centril commented 5 years ago

@eddyb How are the mentoring instructions coming along?

eddyb commented 5 years ago

Oh, I don't recall exactly what happened here.

For making static calls do something different, src/librustc_codegen_ssa/mir/block.rs is the place you need to change, while for shims, src/librustc_mir/shim.rs is where their MIR is computed.

But the rest of the pieces, I don't know off the top of my head where they happen.

54183 added a virtual-call-only shim, which is close to what this needs, so that can be used for inspiration, but here we need both reification to fn pointers and virtual calls.

We could start by disallowing reification/virtualization of such functions, and only implement the static dispatch ABI changes.

arielb1 commented 5 years ago

We could start by disallowing reification/virtualization of such functions, and only implement the static dispatch ABI changes.

I don't expect this attribute to be usable on "real" functions without us implementing reification.

But for implementing this, I think one good strategy would be:

Not supporting trait methods

For now, I don't think there is sufficient reason to support implicit caller location on trait methods, and supporting that is fairly complicated. So as @aturon said on the head PR, just don't do it.

Step 0: dealing with the attribute.

You should make sure that the #[blame_caller] attribute "works". This means that:

  1. It emits a feature-gate error unless the correct feature-gate is enabled (see the feature guide for more information about feature gates).
  2. It emits an error when used with odd syntax (e.g. #[blame_caller(42)]),
  3. It emits an error when used on something that is not a free fn/inherent impl fn.
  4. It does not emit an error when used "correctly'.
  5. As the RFC says: using #[naked] or extern "ABI" together with #[rustc_implicit_caller_location] should raise an error.

I'm not particularly sure what's the exact way to do this, but you can maybe find some PR that implemented an attribute for that, or ask on Discord.

Make sure to add tests.

Step 1: have reified methods and direct calls go to different LLVM functions

At this stage, I won't even add a parameter, just have the call-sites go through different paths.

So the virtual-call-only shim at #54183 is a good model for how things need to be done. You'll need to add a ReifyShim that is similar to VirtualShim in behavior (see that PR). You'll probably need to split ty::Instance::resolve to ty::Instance::resolve_direct and ty::Instance::resolve_indirect (in addition to the already-existing ty::Instance::resolve_vtable) and adjust the call-sites. Have resolve_indirect return the shim if it is needed (i.e., when calling a #[blame_caller] function).

Step 2: Add a location parameter

Step 2.0: add a location parameter to the MIR of #[blame_caller]

MIR construction is in rustc_mir::build. There's already code there handling implicit arguments, which you could work with.

You'll need to fix ty::Instance::fn_sig to also contain the new type.

Step 2.1: add an "empty" location parameter to calls from the shim.

Change the shim you added in Step 1 to pass an "empty" location parameter - some "dummy" that represents a call that uses a shim. See the other shims for how to generate things in shims. Ask @eddyb if you are fighting const eval.

Step 2.2: add a value to the location parameter to direct calls.

Find all the calls to ty::Instance::resolve_direct you added previously, and make them pass the "correct" location parameter to the function. These callsites can poobably already modify parameters because of InstanceDef::Virtual, you should be able to just follow that.

Make sure that when #[track_caller] calls are nested, you pass the location of your call-site, rather than your location. Add a test for that. Also add a test that when one of the calls in the middle is not #[track_caller], the call-site is not passed.

You should probably add a function on ty::Instance that tells you whether you need to add the caller location.

Step 3: wire up intrinsic::caller_location

See some other intrinsic for how to type-check it. Make sure it can't be called from functions that aren't #[trace_caller]!

I think it might even be the best to do the wiring in MIR construction, like the move_val_init intrinsic - that way you wouldn't even have to convince MIR optimizations not to dead-code-eliminate your new parameter or something evil like that, and to make it not be affected by "plain" MIR inlining you'll only have to change the ty::Instance::resolve_direct callsite there (you did make MIR inlining use resolve_direct, did you?).

Step 4: You have it!

Now you only need to implement the library support.

flavius commented 5 years ago

Is anyone working on this?

ayosec commented 5 years ago

If nobody is working on this I would like to try to implement it.

Do you think that it is feasible as a first contribution to the compiler?

Are the steps in https://github.com/rust-lang/rust/issues/47809#issuecomment-443538059 still valid?

cramertj commented 5 years ago

Yes, those steps still sound reasonable. This would be a difficult first contribution, but if you ask questions on https://rust-lang.zulipchat.com/# and ping @eddyb and @arielb1 I'm sure you'd be able to start making progress, and there are likely others who could help out.

ayosec commented 5 years ago

I'm sorry for the delay. I couldn't have time to work on this until now =\

I have completed the step 0 described in https://github.com/rust-lang/rust/issues/47809#issuecomment-443538059 . The progress is available in my branch, though I guess that I will have to wait until the pull-request is open to know is everything is OK.

About error codes: if I understand correctly, every error emitted by the compiler has its unique code, so I have to register a new E0### for every error. I assume that the final code is assigned only when patch is ready to be merged into master, so I'm using dummy codes (E0900, E0901, ...) for the new errors.

anp commented 5 years ago

I'm interested in helping with this, are you still working on it @ayosec? If so, is there any way for me to contribute?

ayosec commented 5 years ago

are you still working on it @ayosec?

Yes. I'm sorry for the lack of updates. During August I had almost no time, and I'm trying to start to continue now. At this moment I'm working with the shim to add the location parameter.

If so, is there any way for me to contribute?

Right now I spend most of the time reading and understanding how everything works, so there is not so much code to write.

Something that would be very helpful is writing the documentation for the new errors (like this or this, at the moment), since I'm not very fluent in English.

oli-obk commented 5 years ago

Ping @ayosec do you have time to address this or is it OK with you if someone else takes over?

ayosec commented 5 years ago

I'm still working on it very slowly, but feel free to take it if you want. I can try to implement another issue once I get more time.

anp commented 5 years ago

I'm eager to pick this up, opened a PR starting from your branch.

@Centril recommended splitting this into a few PRs:

eddyb commented 5 years ago

add a location parameter to MIR of #[track_caller] in rustc_mir::build (reference existing implicit arguments work if needed)

A good example of this that landed recently is the VaList you get for something like this:

extern "C" fn foo(x: T, args: ...) {}

There it's c_variadic in the signature, instead of an attribute, but it's similar in that HIR and MIR bodies have one more input than the signature and a few places have to account for it (see #63492).

anp commented 5 years ago

(replacing this todo list with a db paper link to the implementation notes I'm keeping: https://paper.dropbox.com/doc/rfc-2091-impl-notes--Am~iTmS_9UmQVUK63kwMYAQDAQ-rwCdRc2fi2yvRZ7syGZ9q)

cramertj commented 5 years ago

If I understand stability in std correctly, we can't migrate panic! to use Location::caller() until the latter has stabilized. Is this correct?

No, it's possible to use unstable functionality from inside std (e.g. inside the panic! macro).

anp commented 5 years ago

Is it different for macro expansions? I would naively assume so.

cramertj commented 5 years ago

No-- let's follow up on this zulip thread.

anp commented 5 years ago

Awesome, thanks for clarifying! I edited the library support todo list appropriately.

anp commented 5 years ago

I have a PR ready for review implementing core::intrinsics::caller_location and preparing the panicking infrastructure to use its return value. This lays the groundwork for propagating caller location as a single pointer-sized implicit variable through #[track_caller] functions, which is up next.

I'm using this paper doc for tracking.

anp commented 4 years ago

The PR I linked above has been approved and is towards the top of the homu queue (fingers crossed!). I've opened a PR following that up with the start of a real implementation for the #[track_caller] attribute. Right now it only includes a functional const implementation and failing tests for a not-yet-extant codegen implementation.

comex commented 4 years ago

Awesome work.

Could we also add a method like Location::current() that always returns the location of the code that called Location::current(), even if it happens to be within a #[track_caller] function?

That way, I think e.g. line!() could be replaced with a real macro that expands to Location::current().line().

Admittedly, although it sounds straightforward, it seems like it might be awkward to implement, given the way we wrap intrinsics in wrapper functions.

eddyb commented 4 years ago

It's probably far too heavy for line!(), but I think (|| Location::caller())() would work even in a #[track_caller] function.

anp commented 4 years ago

The PR implementing the attribute is on its way to land. I'm working on a follow-up to lay the groundwork for annotating functions in std, hopefully I can land that before the next beta (and thus bootstrap) branch cut. It would be nice to be able to complete some of the std additions during the next release cycle.

Update: the PR I linked above has landed! I'll pull tonight's nightly in the morning and try it out on my out of tree test cases :D.

anp commented 4 years ago

I have another PR open which I think will be the last one before we can start annotating the standard library. Ideally we can get it in before the next beta branch cut, as that will also allow us to un-cfg all of the existing non-bootstrap-compatible code in core and std.

eddyb commented 4 years ago

I don't think we need to get anything into beta other than the compiler changes (which all landed other than that miri change you mentioned in a comment).

That is, once beta is promoted, anything that is cfg(not(bootstrap)) on beta can/needs to become the only variant on master, because cfg(bootstrap) on master is a fully bootstrapped and released beta.

nikomatsakis commented 4 years ago

Nominating for lang team meeting to update about the awesome progress here, that's all.

anp commented 4 years ago

The PR to have panic! use the new intrinsic always has landed! I'm working on a branch now to annotate these functions in std:

The RFC also listed several implementations of Index and IndexMut, but it wasn't approved because of implementation limitations:

This RFC only advocates adding the #[track_caller] attribute to the unwrap and expect functions. The index and index_mut functions should also have it if possible, but this is currently postponed as it is not investigated yet how to insert the transformation after monomorphization.

Thing is, the current implementation is significantly different from the approved RFC's proposal and actually is compatible with trait methods to my knowledge (it is after monomorphization IIUC).

I'm interested in adding the attribute to the indexing functions, should I submit an "amended RFC" describing the new implementation and seek consensus on applying the attribute to trait methods? I've already thought about doing that since the implementation has deviated so much from the original proposal.

anp commented 4 years ago

Have that PR open with the panic messages working!!!

https://github.com/rust-lang/rust/pull/67887

tesuji commented 4 years ago

About the code bloat drawback in the RFC, would setting RUST_BACKTRACE=1 make it unnoticeable?

anp commented 4 years ago

About the code bloat drawback in the RFC, would setting RUST_BACKTRACE=1 make it unnoticeable?

I don't think so, because RUST_BACKTRACE has effect at runtime, not compile time IIUC.

anp commented 4 years ago

Alright, https://github.com/rust-lang/rust/pull/67887 has landed! Going to manually verify everything is working with tomorrow's nightly.

@nikomatsakis what do you think should be the process to form a plan for #[track_caller] in trait impls? Mentioned above. I believe it's the only remaining significant question that would block stabilization after a couple of release cycles.

nikomatsakis commented 4 years ago

@rustbot modify labels to +I-Nominated

Nominating for lang-team discussion on @anp's final question. Does anyone object to extending #[track_caller] to work in trait impls. Apparently the new discussion strategy (which I am still investigating) can handle this just fine.

My opinion: we should definitely do it! But I would like to see the new impl strategy written up for the rustc-guide, as I'd like to know how it works. (I'm not sure how much of that belongs in the reference, too? At least some of it.)

anp commented 4 years ago

Re: a write-up, I planned to write about the attribute for the reference and to write some info about the implicit arguments for the rustc-guide. I had considered these important for stabilization and am going to do them regardless. If that's all that's required to finish the last bit of proposed work from the RFC, I'm excited!

(edit: the book -> rustc-guide)

nikomatsakis commented 4 years ago

@anp

write some info about the implicit arguments for the book.

what is "the book" in that sentence?

One thing I would really encourage is that you document the high-level picture of the impl for rustc-guide.

nikomatsakis commented 4 years ago

We discussed this in our lang team meeting today. We approve of extending this to trait items. This was based on an understanding that it falls out fairly natural from the implementation -- I gave a summary of what I understand based on skimming the PRs plus earlier conversations with @eddyb.

anp commented 4 years ago

Ah yes, i meant rustc-guide when I said ‘the book’ - I’ve been referring to it much more than TRPL and forgot myself 🤣.

eddyb commented 4 years ago

This example has regressed (including on stable by now): (playground)

fn main() {
    std::collections::VecDeque::<String>::with_capacity(!0);
}

The location reported is now <::core::macros::panic macros>:3:10. (the choice of VecDeque::with_capacity is arbitrary, it just happens to have an assert! inside, and it will be codegen'd in the user crate because it's generic)

In general, we don't track enough information to recover the invocation site, when we have to codegen the MIR cross-crate. @Aaron1011 made me realize this is a problem in https://github.com/rust-lang/rust/pull/68965#issuecomment-583781140.

A cheap hack we could do is replace the macro span with the invocation one, when serializing Spans inside MIR (or Spans in general?).

cc @petrochenkov

anp commented 4 years ago

Status update:

  1. the bug @eddyb mentioned in https://github.com/rust-lang/rust/issues/47809#issuecomment-583784847 has a proposed fix in https://github.com/rust-lang/rust/pull/68965. I don't think there's any help I can offer, but I'm available if there is.
  2. I've opened a tentative PR to the reference so that docs are ready and reviewed whenever stabilization happens.
  3. I've opened a draft PR to rustc-guide. I have some more handwritten notes to add to it and will ask for review when it's fleshed out.
  4. I've opened a PR to allow the attribute in traits and adding it to indexing traits in std.
  5. I believe once the trait impl lands that all unresolved questions from the RFC will be resolved. Woo!

I'll be keeping those PRs warm and adding to the rustc-guide section for now. Modulo new bugs, I'm hoping that we'll be able to let this bake for a couple of releases and then the next active task will be a stabilization report. Am I missing anything?

eddyb commented 4 years ago

the bug @eddyb mentioned in https://github.com/rust-lang/rust/issues/47809#issuecomment-583784847 has a proposed fix in #68965

No, it was brought up on #68965, but the fix is to track hygiene info cross-crate, which last I heard @Aaron1011 was working on.

Aaron1011 commented 4 years ago

I have some code locally which serializes hygiene information, but it's currently blocked on https://github.com/rust-lang/rust/pull/68941

petrochenkov commented 4 years ago

https://github.com/rust-lang/rust/issues/47809#issuecomment-583784847

[triagebot] I don't know anything about the treatment of spans in MIR inlining.