Open djoooooe opened 3 years ago
Good find! Could be a bug in the transcoder (these were loose ports of the C code in Ruby) or a bad/old Unicode table.
@lopex got a chance to look at this? Maybe it's another one-character fix. 😀
I took a look into the CP50220 issue and read through the relevant functions (org.jcodings.transcode.TranscodeFunctions#funSoCp50220Encoder, org.jcodings.transcode.TranscodeFunctions#funSoCp5022xEncoder) and the data table used here (org.jcodings.transcode.TranscodeFunctions#tbl0208) and everything appears to match the C implementation.
A reduced case can use the bytes {0, 127, -114, -95, -114, -2}, because it blows up on the first -2. In Ruby you can use the following snippet of code:
"\x00\x7f\x8e\xa1\x8e\xfe\xa1\xa1\xa1\xfe".force_encoding("CP51932").encode("CP50220")
It blows up in JRuby and works in CRuby.
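For anyone who wants to feed the reduced case above to both runtimes, here is a minimal sketch of turning the signed Java byte values into a Ruby string. The names `REDUCED_CASE` and `java_bytes_to_string` are made up for illustration and are not part of jcodings or JRuby. The six bytes it produces are the first six bytes of the longer string above.

```ruby
# Signed byte values from the reduced case (hypothetical constant name).
REDUCED_CASE = [0, 127, -114, -95, -114, -2].freeze

# Hypothetical helper: "c*" packs each integer as a signed 8-bit value,
# so -114 becomes 0x8E, -95 becomes 0xA1, and -2 becomes 0xFE.
def java_bytes_to_string(bytes)
  bytes.pack("c*")
end

input = java_bytes_to_string(REDUCED_CASE)

# Show the raw bytes in hex so they can be compared with the Ruby literal.
puts input.bytes.map { |b| format("%02x", b) }.join(" ")

begin
  # Same transcoding path as the one-liner in this thread.
  result = input.force_encoding("CP51932").encode("CP50220")
  puts "encoded #{result.bytesize} bytes"
rescue StandardError => e
  # Report any Ruby-level error instead of aborting the script.
  puts "#{e.class}: #{e.message}"
end
```

Running the same file under CRuby and JRuby should make any difference in behavior easy to compare side by side.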
The following unit tests crash in org.jcodings.transcode.Transcoding: