Closed (tsahara closed this issue 10 years ago)
Whether a regexp supports \u depends on the engine. In fact, escape sequences vary from engine to engine, so I thought all escaping would be better handled by the regexp parser that each engine provides. For example, as far as I checked, Oniguruma can compile \u sequences.
But this issue report may indicate a problem. How much do you think the mruby parser should convert backslash escape sequences independently of the regexp engine?
For example, you don't want /foo\u42/ to become /foo*/ and match "foooo", do you?
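The concern can be sketched in CRuby. The helpers `naive_pattern` and `literal_pattern` are hypothetical names made up for this illustration, not mruby parser internals, and the example uses U+002A, the code point of "*":

```ruby
# Hypothetical helpers (not mruby internals) showing why splicing a decoded
# character into the pattern text can change what the pattern means.

# Naive conversion: put the decoded character straight into the source.
def naive_pattern(prefix, char)
  Regexp.new(prefix + char)
end

# Careful conversion: escape the character so it stays a literal.
def literal_pattern(prefix, char)
  Regexp.new(prefix + Regexp.escape(char))
end

star = "\u002a" # U+002A is "*", which is a regexp quantifier

puts naive_pattern("foo", star).match?("foooo")   # "*" acts as a quantifier here
puts literal_pattern("foo", star).match?("foooo") # "*" must appear literally here
puts literal_pattern("foo", star).match?("foo*")
```

If the parser leaves the backslash sequence untouched instead, the decision about what the escape means stays with the regexp engine.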
No, I don't :) (though it could safely be converted to \x2a for some regular expression libraries... hmm, did you mean \u002a?). Since mruby does not define the syntax of a regular expression literal beyond its being enclosed in /, it sounds reasonable to me that the mruby parser does not convert any backslash escape sequences.
In that case, I'd make it a spec of mruby regexp and close this issue.
Thank you for the clarification.
OK, I found a solution:
"Żółta żaba żarła żur".split("")
Why doesn't my program work in mruby? "Żółta żaba żarła żur".scan(/./m).each {|a| print a}
"doesn't work" doesn't help us solve your problem.
You have to tell us what you expected, what you got, and information about your platform.
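For comparison, here is a small CRuby sketch of the two approaches mentioned in this thread. The helper names `chars_by_split` and `chars_by_scan` are made up for the illustration. Whether mruby gives the same result depends on how it was built and which regexp gem is loaded, so this shows only the CRuby behaviour:

```ruby
# CRuby sketch; mruby results depend on build options and the regexp gem.

# Split on the empty string: one element per character.
def chars_by_split(text)
  text.split("")
end

# Scan with /./m: "." also matches newlines because of the m flag.
def chars_by_scan(text)
  text.scan(/./m)
end

sample = "Żółta żaba żarła żur"

# In CRuby both approaches yield the same list of characters.
puts chars_by_split(sample) == chars_by_scan(sample)

# Character count and byte count differ for non-ASCII text.
puts "#{sample.length} characters, #{sample.bytesize} bytes"
```

Printing both the character count and the byte count is a quick way to see whether a given build treats the string as characters or as bytes.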
What regexp gem did you use?
Before commit https://github.com/mruby/mruby/commit/5f2817b36c32ff71031c514b2fdf51ba6b74d83c was applied, Unicode escape sequences in regular expression literals were parsed by the mruby parser and converted to UTF-8 byte sequences. Regexp.compile expects that:
But after the commit, they are not converted:
Is this an intentional change?
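The two forms of pattern source being discussed can be illustrated as follows. This is not the snippet from the original report; it is a CRuby sketch, the helper names `converted_source` and `raw_source` are hypothetical, and U+3042 is an arbitrary example character. How an mruby regexp gem treats the second form depends on the engine behind it:

```ruby
# Form 1: the escape has already been converted to UTF-8 bytes
# before the regexp engine sees the pattern source.
def converted_source
  "\u3042" # double quotes: Ruby replaces the escape with the character
end

# Form 2: the backslash sequence is passed through unchanged,
# so the regexp engine has to interpret it.
def raw_source
  '\u3042' # single quotes: a backslash, "u", and four hex digits
end

# UTF-8 bytes of the converted form, printed in hex.
puts converted_source.bytes.map { |b| format("%02x", b) }.join(" ")

# Number of characters in the unconverted form.
puts raw_source.length

# In CRuby both sources compile to a pattern that matches the character.
puts Regexp.new(converted_source).match?("\u3042")
puts Regexp.new(raw_source).match?("\u3042")
```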