On the JVM backend, various Unicode-related tests (e.g. in https://github.com/Raku/roast/) fail because some string opcodes operate on Java's `char`s rather than on graphemes.
Examples:
$ ./rakudo-m -e 'my Str $u = "\x[0043,0323]"; say "$u -- chars: " ~ $u.chars'
C̣ -- chars: 1
$ ./rakudo-j -e 'my Str $u = "\x[0043,0323]"; say "$u -- chars: " ~ $u.chars'
C̣ -- chars: 2
$ ./rakudo-m -e 'my $str = join "", 0x10426.chr, 0x10427.chr; say $str.chars; say substr($str, 0, 1).uniname; say substr($str, 1, 1).uniname'
2
DESERET CAPITAL LETTER OI
DESERET CAPITAL LETTER EW
$ ./rakudo-j -e 'my $str = join "", 0x10426.chr, 0x10427.chr; say $str.chars; say substr($str, 0, 1).uniname; say substr($str, 1, 1).uniname'
4
<surrogate-D801>
<surrogate-DC26>
The problem is even mentioned in Rakudo's documentation on the routine `chars`: "Please note that on the JVM, you currently get codepoints instead of graphemes."
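To illustrate where the JVM numbers above plausibly come from, here is a minimal plain-Java sketch. It is not taken from the NQP or Rakudo sources; the class name `GraphemeDemo` and its helper methods are hypothetical and only show how Java's own `String` API counts the two test strings. The grapheme count uses the JDK's `BreakIterator`, which only approximates grapheme clusters and is not NFG.

```java
import java.text.BreakIterator;

// Hypothetical demo class, not part of NQP or Rakudo.
public class GraphemeDemo {
    // Number of UTF-16 code units, i.e. what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Approximate number of user-perceived characters, using the
    // JDK's character BreakIterator (an approximation, not NFG).
    static int graphemes(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    // Build a string from code points, like join "", 0x10426.chr, 0x10427.chr.
    static String fromCodePoints(int... cps) {
        StringBuilder sb = new StringBuilder();
        for (int cp : cps) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String combining = fromCodePoints(0x0043, 0x0323); // C + COMBINING DOT BELOW
        String deseret = fromCodePoints(0x10426, 0x10427); // two supplementary code points

        for (String s : new String[] { combining, deseret }) {
            System.out.println("code units: " + codeUnits(s)
                    + ", code points: " + codePoints(s)
                    + ", graphemes (approx.): " + graphemes(s));
        }

        // Indexing by char yields the two halves of the first surrogate pair.
        System.out.println(String.format("%04X %04X",
                (int) deseret.charAt(0), (int) deseret.charAt(1)));
    }
}
```

The code-unit counts (2 and 4) and the `D801 DC26` surrogate halves match the `rakudo-j` output shown above, which suggests that the affected opcodes index by `char` rather than by code point or grapheme.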
I'm not sure whether this can be solved without fully supporting NFG (https://github.com/Raku/nqp/issues/241).
Still, I would at least like to use this issue as a reference for fudged tests.