haku closed this 8 years ago
To add to the fun...
public class A {
    public static void main(String[] args) {
        String a = "💪🏼💪🏼💪🏼";
        System.out.println(a);
        System.out.println(a.length());
        String b = "γ§γ";
        System.out.println(b);
        System.out.println(b.length());
    }
}
On my desktop (java version "1.7.0_79") this outputs:
$ java A
💪🏼💪🏼💪🏼
12
γ§γ
2
Which is going to make writing a JUnit test for this so much fun.
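For what it's worth, the 12 comes from `String.length()` counting UTF-16 code units rather than characters. Here is a rough sketch of the difference, assuming the string is three flexed-biceps emoji (U+1F4AA), each followed by the skin tone modifier U+1F3FC. The class and method names are made up for illustration, and the string is written with `\u` escapes so a test would not depend on the source file encoding.

```java
final class LengthDemo {
    // Three U+1F4AA emoji, each followed by the modifier U+1F3FC (assumed content),
    // written as UTF-16 escapes so the file encoding cannot corrupt them.
    static final String BICEPS =
            "\ud83d\udcaa\ud83c\udffc\ud83d\udcaa\ud83c\udffc\ud83d\udcaa\ud83c\udffc";

    // Number of UTF-16 code units, which is what String.length() reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points; each surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(utf16Units(BICEPS)); // 12
        System.out.println(codePoints(BICEPS)); // 6
    }
}
```

Neither number is the 3 that a user sees, since each emoji and its modifier render as a single glyph.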
May be related: http://stackoverflow.com/a/969200/332868
How about just gsub'ing the t.co URL? (Thanks @rvedotrc)
Possible occurrence/variation of this issue in this tweet which renders correctly on Web and iOS like this but on Onosendai looks like this.
Trying a find-and-replace approach instead of using offset indexes. While working on the same code, I also made it remove image URLs like Twitter does.
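A minimal sketch of what a find-and-replace approach could look like, not the actual change. The `UrlExpander` class, the `expand` method and the example URLs are all hypothetical. The point is only that a literal match on the t.co URL does not care how the characters before it are counted.

```java
final class UrlExpander {
    // Hypothetical helper: swaps every literal occurrence of the shortened URL
    // for its replacement, so no character offsets are involved.
    static String expand(String text, String shortUrl, String replacement) {
        return text.replace(shortUrl, replacement);
    }

    public static void main(String[] args) {
        // Emoji plus modifier in front of the URL, written as UTF-16 escapes.
        String text = "\ud83d\udcaa\ud83c\udffc see https://t.co/abc123";
        System.out.println(expand(text, "https://t.co/abc123", "https://example.com/page"));
    }
}
```

Passing an empty string as the replacement drops the URL entirely, which is one way image URLs could be removed.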
Seems to be fixed ok. Closing.
This tweet https://twitter.com/tintinfp/status/654805685645873153 contains the string:
Which in JSON looks like:
It should be decoded as 6 characters:
But then rendered as 3.
Even Twitter's website, which clearly knows about skin tone modifiers, renders this wrong, with the modifier as its own block instead of altering the preceding character.
The API reports the character offset of the URL as if each \ud83c\udffc does not exist, i.e. as if this is 3 characters long:
OS / Android does not know about skin tone modifiers, and does not even seem to know these are Unicode characters at all. It therefore counts each \ud83c\udffc as two characters, positioning the URL 6 characters too far to the left and splatting the preceding text.
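If offsets had to be kept, one option would be to translate them before use. The sketch below assumes the behaviour described above: the reported offset counts every code point except skin tone modifiers (U+1F3FB to U+1F3FF) as one character. That is an assumption drawn from this one tweet, not documented API behaviour, and `OffsetMapper` and `toCharIndex` are made-up names.

```java
final class OffsetMapper {
    // Fitzpatrick skin tone modifiers occupy U+1F3FB..U+1F3FF.
    static boolean isSkinToneModifier(int codePoint) {
        return codePoint >= 0x1F3FB && codePoint <= 0x1F3FF;
    }

    // Converts an offset that counts each non-modifier code point as one
    // character (assumed behaviour) into an index into the Java string.
    static int toCharIndex(String text, int apiOffset) {
        int index = 0;   // position in UTF-16 code units
        int counted = 0; // characters counted the way the API is assumed to
        while (index < text.length()) {
            int cp = text.codePointAt(index);
            if (!isSkinToneModifier(cp)) {
                if (counted == apiOffset) {
                    return index;
                }
                counted++;
            }
            index += Character.charCount(cp);
        }
        return index; // offset at or past the end of the text
    }

    public static void main(String[] args) {
        String text = "\ud83d\udcaa\ud83c\udffc\ud83d\udcaa\ud83c\udffc\ud83d\udcaa\ud83c\udffc"
                + "https://t.co/abc123";
        System.out.println(toCharIndex(text, 3));  // 12
        System.out.println(text.indexOf("https")); // 12
    }
}
```

This only covers skin tone modifiers. Other combining sequences could still throw the count off, which is part of why matching on the URL text is less fragile.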