Open samshutchins opened 1 year ago
Current jsoup: html > body > img.e\́
Chrome: body > p.e\\u0301
I don't think it's incorrect to emit it as a run of characters, and the selector does work in jsoup. We could improve this by escaping the combining form as a \u escape sequence, as Chrome does.
The example above uses a combining character to create an `é`. Emoji make heavy use of combining characters (👨👨👧👧 is made up of 11 UTF-16 code units: `\uD83D\uDC68\u200D\uD83D\uDC68\u200D\uD83D\uDC67\u200D\uD83D\uDC67`). I have seen emoji used as CSS class names in the wild, and I think the character-escaping code is doing the wrong thing when calling `cssSelector`: it looks like it's escaping every character individually, which breaks things with these combining characters.
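To make the character counts concrete, here is a minimal Java sketch that inspects the two strings discussed in this issue. It uses only standard `String` methods and does not call jsoup, so it says nothing about what `cssSelector` does internally. The class name `CodeUnitDemo` and the helper `toUnicodeEscapes` are hypothetical names made up for this illustration.

```java
// Hypothetical demo class, not part of jsoup.
class CodeUnitDemo {

    // Hypothetical helper: formats each UTF-16 code unit (Java char) of s
    // as a backslash-u escape with four uppercase hex digits.
    static String toUnicodeEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "e" followed by the combining acute accent U+0301
        String eAcute = "e\u0301";
        // The family emoji from this issue: four emoji joined by U+200D
        String family = "\uD83D\uDC68\u200D\uD83D\uDC68\u200D\uD83D\uDC67\u200D\uD83D\uDC67";

        System.out.println(eAcute.length());                            // 2 code units
        System.out.println(family.length());                            // 11 code units
        System.out.println(family.codePointCount(0, family.length()));  // 7 code points
        System.out.println(toUnicodeEscapes(family));
    }
}
```

A loop over `char` values sees 11 separate units in the emoji, and each half of a surrogate pair is visited on its own. Iterating by code point (for example with `String.codePoints()`) would at least keep surrogate pairs together, which may be a useful direction for the escaping code, though that is an assumption about the fix rather than something confirmed here.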