Closed ImplOfAnImpl closed 1 year ago
Rerun the generation with the -W option (it enables re2c warnings), and you'll see that re2c complains about undefined control flow:
warning: control flow in condition 'str' is undefined for strings that match
'[\x0\x80-\xC1\xF5-\xFF]'
'[\xC2-\xDF] [\x0-\x7F\xC0-\xFF]'
'\xE0 [\x0-\x9F\xC0-\xFF]'
'[\xE1-\xEF] [\x0-\x7F\xC0-\xFF]'
'\xF0 [\x0-\x8F\xC0-\xFF]'
'[\xF1-\xF3] [\x0-\x7F\xC0-\xFF]'
'\xF4 [\x0-\x7F\x90-\xFF]'
'\xE0 [\xA0-\xBF] [\x0-\x7F\xC0-\xFF]'
'[\xE1-\xEF] [\x80-\xBF] [\x0-\x7F\xC0-\xFF]'
'\xF0 [\x90-\xBF] [\x0-\x7F\xC0-\xFF]'
'[\xF1-\xF3] [\x80-\xBF] [\x0-\x7F\xC0-\xFF]'
'\xF4 [\x80-\x8F] [\x0-\x7F\xC0-\xFF]'
'\xF0 [\x90-\xBF] [\x80-\xBF] [\x0-\x7F\xC0-\xFF]'
'[\xF1-\xF3] [\x80-\xBF] [\x80-\xBF] [\x0-\x7F\xC0-\xFF]'
'\xF4 [\x80-\x8F] [\x80-\xBF] [\x0-\x7F\xC0-\xFF]'
, use default rule '*' [-Wundefined-control-flow]
This means that not all possible code paths are covered by your rules: if the input happens to match one of the above patterns, control flow in your program is undefined. What you need is to define the default rule * in every condition. See here for details: https://re2c.org/manual/warnings/warnings.html#wundefined-control-flow
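To make the patterns in the warning concrete, here is a rough hand-written C++ sketch of what one step of the 'str' condition has to decide in UTF-8 mode. It is not re2c output and not the code from this issue: the names `Step`, `Result` and `match_str_char` are hypothetical, and the byte ranges are simply the complement of the ranges listed in the warning. Anything that is neither a closing quote nor a well-formed non-zero code point falls into the `Default` branch, which is the case a default rule * would catch (consuming one code unit).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical types for this sketch; they are not part of re2c or of the issue's code.
enum class Step { Char, Quote, Default };

struct Result {
    Step step;
    std::size_t length;  // number of bytes consumed by this step
};

static bool in_range(std::uint8_t b, std::uint8_t lo, std::uint8_t hi) {
    return b >= lo && b <= hi;
}

// One scanning step inside a string literal, assuming the rules are roughly
// "'" (closing quote) and [^'\x00] (any other non-zero code point).
Result match_str_char(const unsigned char* p, std::size_t n) {
    if (n == 0) return {Step::Default, 0};
    const Result fallback{Step::Default, 1};  // default branch: consume one byte

    const std::uint8_t b0 = p[0];
    if (b0 == '\'') return {Step::Quote, 1};
    if (b0 == 0x00) return fallback;          // NUL is excluded by the class
    if (b0 <= 0x7F) return {Step::Char, 1};   // plain ASCII

    // Multi-byte sequences: length and allowed range of the second byte,
    // following the lead-byte groups shown in the warning.
    std::size_t len = 0;
    std::uint8_t lo = 0x80, hi = 0xBF;
    if (in_range(b0, 0xC2, 0xDF))      { len = 2; }
    else if (b0 == 0xE0)               { len = 3; lo = 0xA0; }
    else if (in_range(b0, 0xE1, 0xEF)) { len = 3; }
    else if (b0 == 0xF0)               { len = 4; lo = 0x90; }
    else if (in_range(b0, 0xF1, 0xF3)) { len = 4; }
    else if (b0 == 0xF4)               { len = 4; hi = 0x8F; }
    else return fallback;              // 0x80-0xC1 and 0xF5-0xFF are never lead bytes

    if (n < len) return fallback;      // truncated sequence
    if (!in_range(p[1], lo, hi)) return fallback;
    for (std::size_t i = 2; i < len; ++i) {
        if (!in_range(p[i], 0x80, 0xBF)) return fallback;
    }
    return {Step::Char, len};
}
```

Without the `fallback` branch, a caller that loops on `match_str_char` has no defined behavior for a NUL byte or an ill-formed sequence, which is the situation the warning describes.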
Yeah, my bad. The code is actually broken in the non-utf8 mode too. Thanks for the prompt reply!
Should this be closed?
Hi, I may be missing something but I have code that works with "re2c:encoding:utf8 = 0" and fails with "re2c:encoding:utf8 = 1". Below are the rules, the full code is here:
The input is 'aa. With "utf-8" off, it works just fine and I get "Type = 1", which is TokenType::UnterminatedStr. And with "utf-8" on, the code loops indefinitely:
It looks like the zero code point is not handled in "[^'\x00]" in this case. Or am I doing something wrong?