rust-lang / rust

Doc comments can be passed to macros as literals #61001

Open petrochenkov opened 5 years ago

petrochenkov commented 5 years ago

Doc comments are the only tokens that cannot be passed to macros precisely; they are converted into a #[doc = "text"] form instead, which is a pretty big hack. This conversion may change escaping in the text irrecoverably and can also change semantics in corner cases (e.g. doc starts going through name resolution). Also, this is the single reason why doc comments may need a conversion at all (https://github.com/rust-lang/rust/issues/60935).
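To make the current behavior concrete, here is a minimal sketch. The macro name `attr_text` and the helper function are hypothetical and do not come from the compiler or from this issue. It captures a doc comment through a `meta` matcher and stringifies it, which shows that the macro receives attribute tokens rather than the comment itself. The exact string printed, including whether the text shows up as a raw or an escaped string literal, may differ between compiler versions.

```rust
// Hypothetical helper macro: matches exactly one attribute and returns
// the tokens inside `#[...]` as a string.
macro_rules! attr_text {
    (#[$m:meta]) => {
        stringify!($m)
    };
}

// The doc comment below is not an attribute in the source text, yet it
// matches the `#[$m:meta]` pattern because it is converted to a
// `#[doc = ...]` form before the macro sees it.
fn doc_comment_as_seen_by_macro() -> &'static str {
    attr_text!(
        /// text
    )
}

fn main() {
    // Prints the attribute form of the comment, not `/// text`.
    println!("{}", doc_comment_as_seen_by_macro());
}
```

Because the macro only ever sees the converted form, it has no way to tell whether the caller wrote `/// text` or an explicit `#[doc = " text"]`.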

Lexically, doc comments are raw string literals with weird quotes (/** + */ and /// + \n). Syntactically they certainly can be interpreted as attribute literals, similarly to how strings or integers are expression literals, and how ! would be a type literal if it weren't punctuation lexically. We can use this intuition for fixing the situation with passing doc comments to macros.

Declarative macros

The recently introduced literal matcher can start matching doc comments.

macro m($doc: literal) {
    $doc
    struct S;
}

m!(/** text */);

I don't think there are going to be any implementation issues with that.

Proc macros

We cannot add a new variant to TokenTree backward compatibly, but the content of TokenTree::Literal is only available through to_string() and is open for additions.

So, literal.to_string() can start returning things like /// Text. Of course, syn and friends must be ready to get a result like this from stringifying a literal token.

Centril commented 5 years ago

How does this plan account for e.g.

macro_rules! foo {
    ($(#[$m:meta])* $i:ident) => {
        $(#[$m])*
        pub struct $i;
    }
}

foo!(
    /// Stuff
    /// More stuff
    Alpha
);

This idiom is used in various places today, including in the compiler, and presumably must continue to work.
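For reference, a runnable sketch of this idiom follows. The `ATTR_COUNT` constant is a hypothetical addition for illustration only; it is not part of the original example. It counts how many attributes the `meta` matcher captured, which makes it visible that each `///` line arrives as one attribute.

```rust
macro_rules! foo {
    ($(#[$m:meta])* $i:ident) => {
        // Forward every captured attribute (doc comments included) to the item.
        $(#[$m])*
        pub struct $i;

        impl $i {
            // Hypothetical addition: number of attributes matched by `$m`.
            pub const ATTR_COUNT: usize = <[&str]>::len(&[$(stringify!($m)),*]);
        }
    };
}

foo!(
    /// Stuff
    /// More stuff
    Alpha
);

fn main() {
    let _alpha = Alpha;
    // Each doc comment line was matched as a separate attribute.
    println!("attributes captured: {}", Alpha::ATTR_COUNT);
}
```

Any change to how doc comments are tokenized would need to keep a pattern like `$(#[$m:meta])*` matching them in this way.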

Having read the issue, I'm not fully sure what is proposed... Do you just want to make /// Foobar be accepted by the literal matcher?

> Of course, syn and friends must be ready to get a result like this from stringifying a literal token.

cc @dtolnay

petrochenkov commented 5 years ago

@Centril

> #[$m:meta]

This still needs to work, similarly to how "foo" still matches expr after the introduction of literal.
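The analogy with string literals can be checked on stable Rust today. In the sketch below, the macro names `via_expr` and `via_literal` are hypothetical. The same token `"foo"` is accepted by both the `expr` and the `literal` matcher, while a non-literal expression is accepted only by `expr`.

```rust
// Hypothetical macros: each one simply passes its argument through.
macro_rules! via_expr {
    ($e:expr) => {
        $e
    };
}

macro_rules! via_literal {
    ($l:literal) => {
        $l
    };
}

fn main() {
    // A string literal matches both fragment specifiers.
    let a: &str = via_expr!("foo");
    let b: &str = via_literal!("foo");
    assert_eq!(a, b);

    // A compound expression still matches `expr`; it would be rejected
    // by `via_literal!` because it is not a single literal token.
    let c: i32 = via_expr!(1 + 2);
    assert_eq!(c, 3);
}
```

Under the proposal, doc comments would relate to `meta` and `literal` in a similar way: the existing `#[$m:meta]` patterns keep matching, and `literal` becomes an additional, more precise way to capture the same token.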

> Do you just want to make /// Foobar be accepted by the literal matcher?

Yes, I updated the issue with an example.

Centril commented 5 years ago

So assuming nothing breaks and the upsides in terms of perf are notable with this approach, I don't have any objections re. making doc comments into literals (tho it might seem marginally weird from a user non-compiler-dev POV). Before seeing some numbers it's hard to say whether we should do it or not however.

petrochenkov commented 5 years ago

My primary motivation is the ability to pass all tokens around losslessly; the perf benefit is secondary.

dtolnay commented 5 years ago

> it might seem marginally weird from a user non-compiler-dev POV

Confirmed.

I would like to push back on this. For both declarative macros and proc macros, I prefer the current behavior over what is proposed. We already have experience with the proposed model because this used to be how doc comments in proc macros worked around 1.26-nightly, and it was a big mess for macros. We intentionally switched to the current system in #49545.

Is there any other way that the lossiness cases can be addressed?

> This conversion may change escaping in the text irrecoverably

Could you give an example of a doc comment that would be lossy?