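In Unicode, a grapheme might be represented by a sequence of several codepoints, and in JavaScript a single codepoint outside the Basic Multilingual Plane takes two UTF-16 code units. For example, the emoji 🫶 is one codepoint, U+1FAF6, but two code units: \ud83e\udef6.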
Should the length of a string in JME count graphemes or codepoints? I think the least-surprising answer from a human's perspective is graphemes, but that means that all the methods for indexing and slicing strings need to be grapheme-aware.
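To make the difference concrete, here is a rough sketch of what grapheme-aware length and slicing could look like in JavaScript, using the built-in `Intl.Segmenter`. The helper names `graphemes`, `graphemeLength` and `graphemeSlice` are made up for illustration and are not existing JME functions; this assumes a runtime that supports `Intl.Segmenter`.

```javascript
// Hypothetical helpers for illustration only; not part of JME.
// Assumes the runtime provides Intl.Segmenter.
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Split a string into an array of grapheme clusters.
function graphemes(str) {
  return Array.from(segmenter.segment(str), (s) => s.segment);
}

// Length measured in graphemes rather than UTF-16 code units.
function graphemeLength(str) {
  return graphemes(str).length;
}

// Slice by grapheme index, so a cluster is never cut in half.
function graphemeSlice(str, start, end) {
  return graphemes(str).slice(start, end).join('');
}

const heart = '\ud83e\udef6';        // 🫶, one codepoint (U+1FAF6)
const toned = heart + '\u{1F3FD}';   // 🫶 followed by a skin tone modifier

console.log(heart.length);           // 2 (UTF-16 code units)
console.log([...heart].length);      // 1 (codepoints)
console.log(graphemeLength(heart));  // 1 (graphemes)

console.log(toned.length);           // 4 (UTF-16 code units)
console.log([...toned].length);      // 2 (codepoints)
console.log(graphemeLength(toned));  // 1 (graphemes)
```

With the skin tone modifier, the three ways of counting all give different answers, which is why indexing and slicing would have to go through something like `graphemeSlice` if length counts graphemes.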