rust-lang / rust-analyzer

A Rust compiler front-end for IDEs
https://rust-analyzer.github.io/
Apache License 2.0
14.2k stars 1.59k forks source link

Incomplete syntax handling for proc-macros produces invalid `TokenStream` with `{`/`}` chars as a `Punct` rather than a `Group` #18244

Open Veetaha opened 2 weeks ago

Veetaha commented 2 weeks ago

rust-analyzer version: rust-analyzer version: 0.3.2129-standalon

rustc version: rustc 1.81.0 (eeb90cda1 2024-09-04)

editor or extension: VSCode v0.3.2129

I've stumbled on a bug when using my proc-macro #[bon::builder].

The minimal reproduction of this is:

#[my_lovely_macro]
fn example() {
    if 1
}

where my_lovely_macro is defined as this

use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn my_lovely_macro(_params: TokenStream, item: TokenStream) -> TokenStream {
    let tokens: Vec<_> = item.clone().into_iter().collect();
    let TokenTree::Group(fn_block) = &tokens[3] else {
        unreachable!();
    };
    let fn_block_tokens: Vec<_> = fn_block.stream().into_iter().collect();

    if fn_block_tokens.len() > 2 {
        // This is the case where RA inserted a dummy {} block
        panic!("{fn_block_tokens:#?}");
    }

    item
}

If you take a look at the panic message from RA hints

it looks like this:

proc-macro panicked: [
    Ident {
        ident: "if",
        span: SpanData { range: 38..40, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Literal {
        kind: Integer,
        symbol: "1",
        suffix: None,
        span: SpanData { range: 41..42, anchor: SpanAnchor(FileId(16777216), 2), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '{',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
    Punct {
        ch: '}',
        spacing: Alone,
        span: SpanData { range: 0..0, anchor: SpanAnchor(FileId(16777216), 4294967294), ctx: SyntaxContextId(0) },
    },
]

Notice how there are two Punct items for { and } braces in this list? This is an invalid token tree. syn fails to parse this syntax, because curly braces are never meant to be Punct characters, they must always be balanced and they are only expected as part of a Group token tree.

This produced a panic in my bon::builder macro visible to the user writing the code if they have an incomplete if somewhere while writing the function body (the entire function is underlined in red and no IDE hints are provided). I'll fix that panic in bon separately to make my macro more bug-resilient to situations like this, but RA should also fix this.

ChayimFriedman2 commented 2 weeks ago

This probably comes from parser recovery. We need to either make it affect macro tt well, or better, not insert non-existing tokens.

flodiebold commented 2 weeks ago

It's not parser recovery; it's this: https://github.com/rust-lang/rust-analyzer/blob/5982d9c420d0dc90739171829f0d2e9c80d98979/crates/hir-expand/src/fixup.rs#L151-L161 (Original code by me, FIXME added by @Veykril and he's of course correct ;) )

Veetaha commented 2 weeks ago

Some more context.

While digging a bit more I also uncovered a bug in proc_macro2 crate (https://github.com/dtolnay/proc-macro2/issues/470).

The standard library's proc_macro::Punct::new() explicitly panics if you pass an incorrect character to that method (see the code here). I suppose RA creates the proc_macro::Punct somehow unsafely by just relying on the Punct ABI layout, otherwise it's impossible to create a Punct with { and } characters.

The problem is worsened by the bug in proc_macro2 (https://github.com/dtolnay/proc-macro2/issues/470), which doesn't panic early when it encounters an invalid Punct.

ChayimFriedman2 commented 2 weeks ago

@rustbot release-assignment

This turned out way more nightmare-y than I thought and I have other things to do.

If anybody is interested in taking my work, my almost complete branch is on GitHub: https://github.com/ChayimFriedman2/rust-analyzer/tree/punct-brace. It passes all tests, but has a bug as the commit message explains. A simple fix may be to never omit a delimiter (and thus make bugs like this still possible, just more rare).