package infinite_loop
main :: proc() {
aa := aaaaa.aaaaaaaaaaaaaaa(aaa, aaaaaaaaaaaaaa, aaaaaaaaaaaaaaa)
}
Change the length of the erroneous line, and it compiles through.
offset is being given too many responsibilities in show_error_on_line.
utf8proc_iterate returns valuable error information which isn't being checked. If you have an Odin file with invalid Unicode in it, that also causes an infinite loop.
package infinite_loop_with_invalid_utf
main :: proc() {
a := <dc00>
}
(That can't be copied directly, you'll have to find another way to enter U+DC00.)
As some side notes about this problem in general, it looks like we're printing ^ regardless of the visual width of the first character of the text. There's still a note in show_error_on_line about how the calculations are for ASCII-only.
/tmp/odin/inf/main.odin(11:6) Error: Undeclared name: a
a := a
^
/tmp/odin/inf/main.odin(12:6) Error: Undeclared name: ば
b := ば
^~^
/tmp/odin/inf/main.odin(13:6) Error: Undeclared name: ç
c := ç
^^
/tmp/odin/inf/main.odin(14:6) Error: Undeclared name: du
d := du
^^
/tmp/odin/inf/main.odin(15:6) Error: Undeclared name: växjö
v := växjö
^~~~~^
I feel like rewriting this entire procedure is in order if the goal is to have proper padding, per the PR. For one, the usage of variable names off and offset was confusing initially, as well as rune_width which should actually be bytes_read - especially when we're checking the actual character width right adjacent.
All that aside, it was the height of cosmic comedy that I met this bug while working on a project to detect the visual width of a Unicode string.
Regression introduced by #3744. @Mango0x45
Change the length of the erroneous line, and it compiles through.
offset
is being given too many responsibilities inshow_error_on_line
.utf8proc_iterate
returns valuable error information which isn't being checked. If you have an Odin file with invalid Unicode in it, that also causes an infinite loop.(That can't be copied directly, you'll have to find another way to enter
U+DC00
.)As some side notes about this problem in general, it looks like we're printing
^
regardless of the visual width of the first character of the text. There's still a note inshow_error_on_line
about how the calculations are for ASCII-only.I would've expected CJK to be taken into account based on https://github.com/odin-lang/Odin/issues/3743#issuecomment-2164060865, and I was surprised to see
ç
treated as two glyphs. Even theväxjö
example given in the original issue is mis-aligned with an extra tilde.I feel like rewriting this entire procedure is in order if the goal is to have proper padding, per the PR. For one, the usage of variable names
off
andoffset
was confusing initially, as well asrune_width
which should actually bebytes_read
- especially when we're checking the actual character width right adjacent.All that aside, it was the height of cosmic comedy that I met this bug while working on a project to detect the visual width of a Unicode string.