Breaking change in highlighting of code under a proc-macro

Veetaha commented 3 weeks ago

This breaking change happened during the transition from version v0.4.2159 -> v0.4.2160. I tested this by switching between different "Pre-release" versions in VSCode:

code snippet to reproduce:

Add this proc macro as a dependency (using a master git dependency is required):

bon = { git = "https://github.com/elastio/bon" }

#[bon::builder]
fn example(arg: u32) {
    let _ = arg;
}

Code highlighting in v0.4.2159 works correctly:

Code highlighting in v0.4.2160 is broken:

The problem is that the arg symbol is assigned the semantic token type of an enum:

Internally (ignoring all other generated code), the macro generates the following code, where the arg symbol is used as a name for an enum, however, that enum declaration appears syntactically higher in code:

mod example_builder {
    mod members {
        pub(in super::super) enum arg {}
    }
}

fn __orig_example(arg: u32) {
    let _ = arg;
}

Previously I relied on the fact that rust-analyzer assigned the semantic token type to the symbol based on the last occurence of its span in the generated code, so I intentionally put the fn __orig_example(arg: u32) { /**/ } at the end of the proc macro out so I'm sure RA doesn't mess up the syntax highlighting, because the original function is always at the end thus spans of its symbols will guarantee correct highlighting.

ChayimFriedman2 commented 3 weeks ago

This isn't exactly a bug, we don't guarantee anything about token classification, it's all heuristics and it's very possible they have changed lately. This can be a question, though.

Veetaha commented 3 weeks ago

Indeed, I guess the question is "how can I ensure consistent highlighting"? I suppose https://github.com/rust-lang/rust-analyzer/pull/18410 may be related.

cc @Veykril

Veetaha commented 3 weeks ago

I think I've solved this for now by reordering the generated items. It seems like now the rule is that "the first occurrence of the span wins" rather than "the last". I just want to make sure this doesn't change in the future and users of bon don't suddenly start seeing broken highlighting

ChayimFriedman2 commented 3 weeks ago

I don't think you can guarantee that, nor I think we want to guarantee that. If we will find a better heuristic, we will use it, and we don't want to be limited by previous guarantees. You can either accept that your macro may show the incorrect semantic highlighting, or try to follow changes here, but still accept that for some times, user may see incorrect highlighting.

Reporting a bug where our heuristics don't work is fine, and we may succeed in finding better heuristics. But we don't want to guarantee a way you will be able to constrain our classification.

A permanent fix for that may be to have some explicit way for macros to override token classification.

Veykril commented 3 weeks ago

Yes we don't guarantee the order her, but we shouldn't have a reason to change it again in my eyes (in order to me sounds more like what people would expect instead of reverse order) either (unless it turns out in the next couple weeks that this worsened the majority of proc-macros out there). Ideally we'd have

A permanent fix for that may be to have some explicit way for macros to override token classification.

but that design space to be explored (and also commitment to exposed API which is always tough.

ChayimFriedman2 commented 3 weeks ago

I wonder if the lang team will accept additional API to proc_macro that takes a span and marks it with a classification (from a #[non_exhaustive] enum).

If not we can probably inject it somehow it the resulting code, but it will be uglier.

lnicola commented 3 weeks ago

Would emitting a custom (tool) attribute in front of the "main" usage in the token stream work?

mod example_builder {
    mod members {
        pub(in super::super) enum arg {}
    }
}

fn __orig_example(#[rust_analyzer::main_usage] arg: u32) {
    let _ = arg;
}

or something like that.

Veykril commented 3 weeks ago

Would emitting a custom (tool) attribute in front of the "main" usage in the token stream work?

mod example_builder { mod members { pub(in super::super) enum arg {} } }

fn __orig_example(#[rust_analyzer::mainusage] arg: u32) { let = arg; }

or something like that.

A tool attribute is what I would imagine as well. I don't think the proc-macro API wants to bother with this (the proc-macro implementation is already really complicated and the compiler wouldn't benefit from this whatsoever)

(Though if we had stable definition site hygiene one could also use that to instruct r-a to ignore mappings, if def site hygiene is appropriate for a given usage)

davidbarsky commented 3 weeks ago

I think it might be useful to spin up an issue tracker along the lines of the existing client-facing changes issue tracker, but aimed specifically at proc macro authors? Tangentially related, should rust-analyzer consider something like https://github.com/intellij-rust/intellij-rust/pull/9711?

lnicola commented 3 weeks ago

https://github.com/search?type=code&q=RUST_IDE_PROC_MACRO_COMPLETION, btw

rust-lang / rust-analyzer

Breaking change in highlighting of code under a proc-macro #18438