Closed RadhiFadlillah closed 1 year ago
Nevermind. Was stupid, found #237, got smarter.
For other people who stumble on the same issue, change your grammar rule like this:
g[ü]ncellen?me {
fmt.Println("MATCH FOUND")
return
}
Then run re2go with the --input-encoding utf8 flag:
re2go --input-encoding utf8 -u --flex-syntax -i input.re -o main.go
Right, for UTF-8 encoded source code, use --input-encoding utf8. For UTF-8 encoded input, use --utf8 / re2c:encoding:utf8 = 1; (you used -u, which is not UTF-8 but UTF-32, which means that your lexer is generated for UTF-32 encoded input). At some point re2c had only short options -u, -8 and so on, but now it has less confusing aliases --utf32, --utf8, etc., as well as configurations for these options.
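To make the difference between the two target encodings concrete, here is a minimal Go sketch (not re2c output) that prints the code units of ü under each one. The helper names utf8Units and utf32Units are made up for this illustration; Go's []rune conversion is used as a stand-in for UTF-32, since it yields one value per code point.

```go
package main

import "fmt"

// utf8Units returns the UTF-8 code units (bytes) of s.
// Hypothetical helper, named for this example only.
func utf8Units(s string) []byte {
	return []byte(s)
}

// utf32Units returns one 32-bit value per Unicode code point of s,
// which is what a lexer generated for UTF-32 input expects to read.
// Hypothetical helper, named for this example only.
func utf32Units(s string) []rune {
	return []rune(s)
}

func main() {
	u := "\u00fc" // the letter ü, written as an escape to avoid editor encoding issues
	fmt.Printf("UTF-8 units:  % x\n", utf8Units(u)) // two bytes
	fmt.Printf("UTF-32 units: %U\n", utf32Units(u)) // one code point
}
```

A lexer built with -u expects the single unit U+00FC, so feeding it a Go string (which holds the two bytes c3 bc) will not match.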
Flex syntax support is somewhat rudimentary in re2c, e.g. in this case it should have worked with güncellen?me but the more recently added --input-encoding option did not play well with --flex-syntax (here in the source code re2c consumes one byte at a time, disregarding the possibility of multibyte characters). This is actually a bug, so I'm reopening this issue to fix it.
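The effect of reading a UTF-8 pattern one byte at a time can be sketched in Go as follows. This is only an illustration of the failure mode described above, not the re2c source; symbolsByByte and symbolsByRune are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// symbolsByByte treats every byte of s as its own symbol, so a multibyte
// character is split into several unrelated symbols.
func symbolsByByte(s string) []rune {
	out := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, rune(s[i]))
	}
	return out
}

// symbolsByRune decodes one complete UTF-8 sequence per step, so each
// character yields exactly one symbol.
func symbolsByRune(s string) []rune {
	var out []rune
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		out = append(out, r)
		i += size
	}
	return out
}

func main() {
	pattern := "g\u00fcn" // "gün", the start of the pattern from this issue
	fmt.Printf("byte at a time: %U\n", symbolsByByte(pattern))
	fmt.Printf("rune at a time: %U\n", symbolsByRune(pattern))
}
```

Read byte by byte, ü becomes the two symbols U+00C3 and U+00BC instead of the single symbol U+00FC.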
I'm glad that g[ü]ncellen?me worked out. Alternatively you can just use re2c-native syntax "güncelle" "n"? "me" which is fully supported, and avoid any further potential issues with flex-like syntax.
@RadhiFadlillah Also note that for cur < len(str) is not a correct way of handling the end of input; you can replace it with just for. It is the sentinel rule [\000] { return } that stops the lexer. More info here: http://re2c.org/manual/manual_go.html#handling-the-end-of-input.
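The idea of a sentinel-terminated loop can be sketched by hand in Go. This is not what re2go generates; countLetters is a hypothetical stand-in for a lexer, and it assumes the caller appends a NUL byte to the input, as the sentinel rule requires.

```go
package main

import "fmt"

// countLetters is a hand-written stand-in for a generated lexer.
// It assumes str ends with a NUL sentinel, so the loop has no bounds
// check: reading the sentinel is what ends it.
func countLetters(str string) int {
	cur, n := 0, 0
	for { // plain "for", not "for cur < len(str)"
		c := str[cur]
		cur++
		switch {
		case c == 0: // plays the role of the rule [\000] { return }
			return n
		case c >= 'a' && c <= 'z':
			n++
		}
	}
}

func main() {
	input := "hello world" + "\x00" // sentinel appended by the caller
	fmt.Println(countLetters(input)) // prints 10
}
```

The loop relies entirely on the sentinel being present; without the trailing NUL, indexing past the end of str would panic.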
Here's a fix: https://github.com/skvadrik/re2c/commit/cbd52e0b8eea3687ada56b70292a7af79af1fb5c, it will be merged into master once it passes all the CI checks.
I'll close this bug, please reopen if you have any further issues.
I'm trying to generate Go code for the regular expression güncellen?me. I made my re2go input file like this:
Then I run re2go like this:
Unfortunately it always fails with the following message:
Any tips on how to solve this issue? Thanks!