Open fmease opened 2 weeks ago
I've tentatively assigned myself but I'm not sure if I will actually tackle this. Not before #126597 is merged, that's for sure.
Priority set to low because this should be turned into an upstream issue over at https://github.com/rust-lang/annotate-snippets-rs, since I want to prioritize migrating rustc to annotate-snippets-rs (#59346) instead of messing with rustc's `emitter.rs`. Note that `--error-format=human-annotate-rs -Zunstable-options` currently panics on the very input I gave above.
Found while reviewing #126597 for the nth and final time. Many parts of the emitter look fishy to me wrt. Unicode handling and to be honest it was quite frustrating to review code related to it because the emitter doesn't make any attempt to newtype the different units used, namely byte lengths/offsets, char / Unicode scalar lengths/offsets and Unicode widths (I hope rust-lang/annotate-snippets-rs will remedy that). That's just an aside.
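To make the aside concrete, here is a minimal sketch of what newtyping the units could look like. None of these names (`ByteLen`, `CharCount`, `DisplayWidth`, `byte_offset_of`) exist in rustc or annotate-snippets-rs; they are hypothetical and only illustrate the idea. Computing a display width needs Unicode width tables that are not in `std`, so `DisplayWidth` is declared but never computed here.

```rust
/// Length or offset in UTF-8 bytes (the unit `str::len` and slicing use).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct ByteLen(pub usize);

/// Length or offset in Unicode scalar values (`char`s).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct CharCount(pub usize);

/// Width in terminal columns. Hypothetical placeholder: producing one
/// requires Unicode width data that `std` does not ship.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct DisplayWidth(pub usize);

pub fn byte_len(s: &str) -> ByteLen {
    ByteLen(s.len())
}

pub fn char_count(s: &str) -> CharCount {
    CharCount(s.chars().count())
}

/// Explicit conversion from a char offset to a byte offset.
/// Returns `None` if `at` lies past the end of `s`.
pub fn byte_offset_of(s: &str, at: CharCount) -> Option<ByteLen> {
    s.char_indices()
        .map(|(i, _)| i)
        // The end of the string is a valid offset too.
        .chain(std::iter::once(s.len()))
        .nth(at.0)
        .map(ByteLen)
}
```

With types like these, passing a `CharCount` where a `ByteLen` is expected becomes a compile error instead of a silent unit mix-up, and every conversion has to go through a named function.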
The string offset acrobatics performed in `HumanEmitter::render_source_line` and `HumanEmitter::draw_line` look incredibly suspicious to me. Let me just link some parts where we likely incorrectly reinterpret different units (byte len, char count, Unicode width):
https://github.com/rust-lang/rust/blob/c22887b4d97400e8e024e19fb5f724eda65ad58d/compiler/rustc_errors/src/emitter.rs#L733-L736
https://github.com/rust-lang/rust/blob/c22887b4d97400e8e024e19fb5f724eda65ad58d/compiler/rustc_errors/src/emitter.rs#L663-L681
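For readers unfamiliar with this class of bug, the following sketch shows how reusing a value in the wrong unit shifts an annotation. It is not the emitter's code; `naive_padding` and `char_padding` are hypothetical helpers written only for this illustration.

```rust
/// Naive approach: reuse a byte offset directly as a number of columns.
pub fn naive_padding(byte_offset: usize) -> String {
    " ".repeat(byte_offset)
}

/// Converts the byte offset to a char count before padding.
/// Returns `None` if `byte_offset` is out of bounds or not on a char boundary.
pub fn char_padding(line: &str, byte_offset: usize) -> Option<String> {
    let prefix = line.get(..byte_offset)?;
    Some(" ".repeat(prefix.chars().count()))
}
```

For the line `"日本語x"`, the `x` sits at byte offset 9 but is only the fourth `char`, so `naive_padding(9)` yields nine spaces while `char_padding` yields three. Even the char count is not necessarily the right column: terminals commonly render such characters two cells wide, which is the third unit (Unicode width) and needs data outside `std`.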
I gave up trying to make sense of this – having to look at all the weakly typed variables and fields of type `usize`. However, based on these functions I crafted a pathological input file where it's clear something is amiss.
Example Reproducer
Compiler Output
Clearly butchered:
Counterexample
Compare this to ASCII-only input:
Compiler Output
Perfectly sensible: