Closed scabug closed 13 years ago
Imported From: https://issues.scala-lang.org/browse/SI-2565?orig=1 Reporter: Florian Hars (florian)
@adriaanm said: Works in 2.8:
Welcome to Scala version 2.8.0.r0-b20091109091326 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_15).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val out = new java.io.PrintStream(System.out, true, "UTF-8")
out: java.io.PrintStream = java.io.PrintStream@b4ef239
scala> val miserable = "Les Misérables"
miserable: java.lang.String = Les Misérables
scala> out.println(miserable.reverse)
selbarésiM seL
(Inspired by http://www.macosxhints.com/article.php?story=20050208053951714)
I opened #2596 to deal with Console.println/the interpreter not printing utf correctly.
@dragos said: I'll leave this closed, but note that the issue is still there. The original ticket used Unicode combining characters: an e followed by a combining acute accent, not a precomposed accented e as in é. Here's the full test with output:
object Test extends Application {
  val out = new java.io.PrintStream(System.out, true, "UTF-8")
  val xs = "Les Mise\u0301rables"
  val ys = "Les Misérables"
  out.println(xs)
  out.println(xs.reverse)
  out.println(ys)
  out.println(ys.reverse)
}
giving:
Les Misérables
selbaŕesiM seL
Les Misérables
selbarésiM seL
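To see why the two strings reverse differently, it can help to list their UTF-16 code units. The following is a minimal sketch; `CodeUnits.dump` is a hypothetical helper written for this illustration, not part of the Scala library.

```scala
object CodeUnits {
  // Hypothetical helper: renders each UTF-16 code unit of s as U+XXXX.
  def dump(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")
}

object CodeUnitsDemo {
  def main(args: Array[String]): Unit = {
    val combining   = "e\u0301r" // e, combining acute accent, r
    val precomposed = "\u00e9r"  // precomposed é, r

    println(CodeUnits.dump(combining))           // U+0065 U+0301 U+0072
    println(CodeUnits.dump(combining.reverse))   // U+0072 U+0301 U+0065
    println(CodeUnits.dump(precomposed))         // U+00E9 U+0072
    println(CodeUnits.dump(precomposed.reverse)) // U+0072 U+00E9
  }
}
```

In the reversed combining form, U+0301 now follows the r, so the accent is rendered on the r instead of the e. The precomposed form has no separate accent to misplace.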
@paulp said: The issue as I interpret it is outside plausible library scope in the near term. We don't have a String reversing function, we have a sequence reversing function, and reverse does that nicely. Reversing a sequence usually reverses the String successfully, and where it doesn't we'll need someone to take a serious interest in encoding issues. It's not trivial.
@adriaanm said: Replying to [comment:2 dragos]:
I'll leave this closed, but note that the issue is still there. The original ticket used unicode combining chars, an e followed by an acute sign, not an accented e, like in é.
I stand corrected.
Florian Hars (florian) said: Replying to [comment:3 extempore]:
We don't have a String reversing function, we have a sequence reversing function
But then you should rename RichString to RichCharSequence :-).
If a method is called RichString.reverse, the natural thing is to expect it to produce something that is the reverse of the string. You have to either swap combiners with their corresponding base characters before or after the reverse, or normalize the string before reversing (java.text.Normalizer, but see http://diveintomark.org/archives/2004/07/06/nfc).
scala> import java.text.Normalizer._
import java.text.Normalizer._
scala> val s = "souf\ufb02e\u0301s"
s: java.lang.String = soufflés
scala> s.reverse
res6: scala.runtime.RichString = śeflfuos
scala> normalize(s,Form.NFC).reverse
res7: scala.runtime.RichString = séflfuos
scala> normalize(s,Form.NFKC).reverse
res8: scala.runtime.RichString = sélffuos
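The other option mentioned above, keeping combiners attached to their base characters while reversing, could be sketched roughly as follows. `CombiningAwareReverse` is a hypothetical name and not an existing library function. The sketch assumes that treating the Unicode general categories Mn, Me and Mc as "combiners" is good enough; it does not implement full grapheme cluster segmentation.

```scala
object CombiningAwareReverse {
  // Assumption: a "combiner" is any code point in category Mn, Me or Mc.
  private def isCombiningMark(cp: Int): Boolean = {
    val t = Character.getType(cp)
    t == Character.NON_SPACING_MARK.toInt ||
    t == Character.ENCLOSING_MARK.toInt ||
    t == Character.COMBINING_SPACING_MARK.toInt
  }

  // Walks the string backwards by code point and emits each cluster
  // (a base character plus its trailing combining marks) in original order.
  def reverse(s: String): String = {
    val out = new java.lang.StringBuilder(s.length)
    var end = s.length // exclusive end of the cluster being collected
    var i = s.length
    while (i > 0) {
      val cp = s.codePointBefore(i)
      i -= Character.charCount(cp)
      // A non-mark starts a cluster; so does the start of the string,
      // which covers a stray leading combining mark.
      if (!isCombiningMark(cp) || i == 0) {
        out.append(s, i, end)
        end = i
      }
    }
    out.toString
  }
}
```

With this sketch, `CombiningAwareReverse.reverse("Les Mise\u0301rables")` keeps U+0301 directly after the e, whereas the plain `reverse` leaves it after the r. Unlike NFC normalization, this also works for sequences that have no precomposed form, but it leaves ligatures such as U+FB02 untouched.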
Reopening, this should either be fixed, or resolved as wontfix with a good reason.
@adriaanm said: Sorry, we can't fix this because we committed to java.lang.String.
As seen in [http://msmvps.com/blogs/jon_skeet/archive/2009/11/02/omg-ponies-aka-humanity-epic-fail.aspx OMG Ponies]: