`$$crate` unexpectedly fails to defer which crate `$crate` refers to

CAD97 commented 2 years ago

$$ was stabilized in #95860 for 1.63. (reverted in #99435; now requires feature gate again.)

I tried this code:

// lib.rs
pub const WHICH: &str = "lib.rs";

#[macro_export]
macro_rules! define_which_macro {
    () => {
        macro_rules! which {
            () => {
                $$crate::WHICH
            };
        }
    };
}

// main.rs
use lib::define_which_macro;

pub const WHICH: &str = "main.rs";

define_which_macro!();

fn main() {
    dbg!(which!());
}

I expected to see this happen:

My intuitive expectation comes from expanding the macros as following:

define_which_macro!();
fn main() {
    dbg!(which!());
}
// =====>
macro_rules! which {
    () => {
        $crate::WHICH
    }
}
fn main() {
    dbg!(which!());
}
// =====>
fn main() {
    dbg!(crate::WHICH);
}

lib provides a macro whose expansion is specified to contain $$ crate; this seemingly should behave as writing $crate in the crate where the macro is expanded; i.e., main.rs should be printed.

Instead, this happened:

lib.rs is printed: the $crate in the expansion of which!() refers to lib.

This likely happens because the $crate compound identifier token refers to the crate indicated by the crate token's span, as indicated by the following example printing main.rs:

// lib.rs
#[macro_export]
macro_rules! define_which_macro {
    ($krate:tt) => {
        macro_rules! which {
            () => {
                $$$krate::WHICH
            };
        }
    };
}

// main.rs
define_which_macro!(crate);
fn main() {
    dbg!(which!());
}

and the following printing lib.rs:

// lib.rs
#[macro_export]
macro_rules! define_which_macro {
    ($d:tt) => {
        macro_rules! which {
            () => {
                $d crate::WHICH
            };
        }
    };
}

// main.rs
define_which_macro!($);
fn main() {
    dbg!(which!());
}

Desired behavior

This seems difficult to resolve. The stated purpose of $$ is to make writing macro-expanded-macros significantly more tractable than the previous [$dollar:tt] pattern defining a manual binder expanding to $. As such, it's fairly clearly the intent that writing $$crate should allow you to expand to a macro using $crate to refer to its defining crate, not the root macro's crate. (If you want the root macro's crate, you can already use just $crate just fine.)

Even though $$ is unstable, though, refining this behavior might be edge-case breaking, as you could already observe the current behavior of $crate's crate choice by using a manual $dollar:tt binder to get a $ into the expanded output.

I have three concepts for how to make $$crate work as is likely intended:

`$crate` uses the `$` token's crate

... and $$'s expanded $'s crate gets set to the crate which expanded it, not the crate of either input $.[^1]

[^1]: I have no idea what span the expanded $ gets currently. The "correct" choice is probably to take the span of the second input $ as-is.

This changes the behavior of $dollar $krate to refer to $dollar's crate rather than $krate's crate.

`$crate` uses the "gluing" crate

When the $crate token is glued into a single ident token, it gets its crate set to the crate def which did the gluing.

This changes the behavior of the $dollar $krate to refer to the gluing crate rather than $krate's crate.

Wait, which is the "gluing" crate then?

There's two options here:

The crate for the macro_rules! name's token. This means that using a known name like in the original example would make $$crate behave the same as $crate, but would make most macro-defining-macros which take the name of the macro to define (or glue based on an input token using paste!, which respects the span of the first used binder) work as expected and produce a $crate referring to the crate which provided the name of the newly-defined macro. This also gives an easily predictable answer to what crate is referred to when multiple levels of macro indirection are involved.
The crate the macro name is registered into the namespace of. It is the author's opinion that this is the most useful option, as any other macro involved can insert a normal $crate token to refer to their namespace, and cannot refer to the crate any defined items end up in unless using a properly escaped $$crate refers to that crate.

Really kludge it just for `$$`

$$ expands into a token that behaves like $ except for the fact that $crate behaves like one of the previous two solutions when using a $$-created $.

This avoids changing the behavior of $dollar $krate, but makes $dollar $krate behave differently from $$ $krate.

Meta

rustc --version --verbose:

rustc 1.64.0-nightly (27eb6d701 2022-07-04)
binary: rustc
commit-hash: 27eb6d7018e397cf98d51c205e3576951d766323
commit-date: 2022-07-04
host: x86_64-pc-windows-msvc
release: 1.64.0-nightly
LLVM version: 14.0.6

CAD97 commented 2 years ago

$crate is taught as

Within a macro imported from a crate named foo, the special macro variable $crate will expand to ::foo. [old 1.5 edition of The Book]

Hygiene is also the reason that we need the $crate metavariable when our macro needs access to other items in the defining crate. What this special metavariable does is that it expands to an absolute path to the defining crate. [The Little Book of Rust Macros]

Both of these are subtly wrong: $crate "expands" (really: is glued into) a single identifier, not a (partial) path, and this is observable by calling macros in a macro expansion. However, they agree that $crate is semantically a "reserved binder" which expands to the crate that the containing macro_rules! is defined in.

What crate the macro is defined in is somewhat ambiguous for a macro expanded macro, but treating $crate more like a reserved binder and less like a written token (resolution option: $crate uses the "gluing" crate) would be more consistent with how $crate is conceptualized and taught today.

CAD97 commented 2 years ago

Related: https://github.com/rust-lang/rust/issues/99037 is another issue resulting from $crate's treatment as a glued token rather than a reserved metevariable/binder.

Mark-Simulacrum commented 2 years ago

Nominating for T-lang to make a call on whether we should do anything here before 1.63 releases, since there's a relatively short time window there.

joshtriplett commented 2 years ago

@CAD97 We discussed this in today's @rust-lang/lang meeting. We do agree that the behavior is surprising, and it'd be worth reverting so we have time to determine the right behavior.

joshtriplett commented 2 years ago

Speaking for myself only: I think for the macros-writing-macros case, it makes more sense to me that $$crate should turn into $crate at the site of expansion, so whatever macro invokes that macro should be the one whose $crate is used.

c410-f3r commented 2 years ago

https://www.youtube.com/watch?v=roRQ2mNwMMQ

Mark-Simulacrum commented 2 years ago

Going to tag as a regression so it gets tracked appropriately as something that needs work before the next release.

steffahn commented 2 years ago

@CAD97 In the OP

and the following printing main.rs:

Is this supposed to say “and the following printing libs.rs”? (I haven’t tested the behavior, but you appear to be trying to describe two code examples with different behavior.)

CAD97 commented 2 years ago

@steffahn oops, yes, you're correct. (I'll triple check when I get a chance but yes, the current behavior is that the $crate glued ident uses the span of the crate token.) (EDIT: got a chance: confirmed that the OP is now correct in saying the second example prints lib.rs.)

apiraino commented 2 years ago

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium T-compiler

steffahn commented 2 years ago

Regarding the idea of fixing this with something like #99193...

This is not actually a regression, right? It's undesired behavior in a new feature.

I don't fully understand how this is sufficient reason for a beta backport of - essentially - changes to the feature; IMO the proper approach would be to revert stabilization of $$ entirely and apply the desired changes on nightly for release in no-earlier-than 1.64 (even if those changes are only to restrict $$'s interaction with $crate in some way). Doing design work (even if it's the design of new restrictions on a to-be-stabilized feature) always has the downside that there's less time left for testing the new behavior.

I don't know if there's an official source of what is or isn't “allowed” in a beta backport, so perhaps my opinion here is irrelevant, and this approach is actually appropriate.

Mark-Simulacrum commented 2 years ago

FWIW, I tend to agree that the preferred approach would be to back out the stabilization on beta; we can separately land #99193 for future versions (1.64+).

CAD97 commented 2 years ago

Relevant:

https://github.com/rust-lang/rust/blob/ed9173276a176126150b4c684a4262a135ce51ef/compiler/rustc_resolve/src/lib.rs#L1724-L1731

This suggests that the current crate-assignment behavior of $crate is known not to work correctly for $$crate. Unfortunately, the comment is slightly incorrect in saying that such configurations are not stably accessible, as $d crate can hit the case that it doesn't handle correctly (and that is hit more easily with $$crate).

I think fixing this ($crate refers to the local crate of the expansion which glued it) may be as simple as the following diff I just wrote:

```rust diff --git a/compiler/rustc_expand/src/mbe/quoted.rs b/compiler/rustc_expand/src/mbe/quoted.rs index d4b8563a036..d7049eb28a5 100644 --- a/compiler/rustc_expand/src/mbe/quoted.rs +++ b/compiler/rustc_expand/src/mbe/quoted.rs @@ -9,7 +9,7 @@ use rustc_span::symbol::{kw, sym, Ident}; use rustc_span::edition::Edition; -use rustc_span::{Span, SyntaxContext}; +use rustc_span::{ExpnId, Span, SyntaxContext}; const VALID_FRAGMENT_NAMES_MSG: &str = "valid fragment specifiers are \ `ident`, `block`, `stmt`, `expr`, `pat`, `ty`, `lifetime`, \ @@ -221,6 +221,12 @@ fn parse_tree( let (ident, is_raw) = token.ident().unwrap(); let span = ident.span.with_lo(span.lo()); if ident.name == kw::Crate && !is_raw { + // If we use `ident.span` here, $crate will refer to the crate where + // the crate token appears; we want it to refer to the crate which + // is actually defining this macro (the current local crate). This is + // most easily accomplished with a macro expanded macro_rules! which + // uses `$$ crate` to get a `$crate` in the expanded macro definition. + let span = span.with_call_site_ctxt(ExpnId::root()); TokenTree::token(token::Ident(kw::DollarCrate, is_raw), span) } else { TokenTree::MetaVar(span, ident) ```

but of course this doesn't work as-is (this apply_mark is directly forbidden), I'll need to dig into this further.

Mark-Simulacrum commented 2 years ago

Untagging as regression from 1.63, as we reverted in https://github.com/rust-lang/rust/pull/99435 (and that PR was backported).

rust-lang / rust