Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io
MIT License
Support splitting strings into Unicode grapheme clusters #22117
When working with Unicode, we usually don't care about the bytes, and usually not about the code points (runes) either. What we mostly care about is the characters displayed on screen (grapheme clusters). Unicode provides an algorithm for splitting strings into grapheme clusters (user-perceived characters). This feature request is about adding grapheme cluster splitting to builtin.
Use Case
Anyone working with a UI who wants to know:
how long a string is (in characters displayed on screen)
where the pointer is on screen
Neither bytes nor runes provide this information.
Another use case is using format strings with Unicode strings.
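To make the mismatch concrete, here is a minimal sketch that counts bytes and runes for the first string from the example further down. It only uses `string.len` and `string.runes()`; the variable name is illustrative.

```v
fn main() {
	// 'n' followed by U+0303 COMBINING TILDE, rendered as one character
	text := '\u006E\u0303'
	println(text.len) // number of UTF-8 bytes: 3
	println(text.runes().len) // number of code points (runes): 2
}
```

Neither number matches the single character a user sees on screen.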
Example:
This text should be right aligned:
examples := [
	'\u006E\u0303',
	'\U0001F3F3\uFE0F\u200D\U0001F308',
	'ห์',
	'ปีเตอร์',
]
println('0123456789abcdefgh')
for text in examples {
	println('${text:10}')
}
But it isn't.
Proposed Solution
Add a feature to split a string into graphemes
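As a rough illustration of what such a feature could do, the sketch below groups runes into clusters with a heavily simplified rule set. It is not the Unicode algorithm from UAX #29: it only knows a few combining ranges (Latin combining marks, variation selectors, some Thai marks) and glues anything that follows a zero width joiner to the previous rune. The names `graphemes`, `is_extend`, `right_align` and `zwj` are hypothetical and not part of V's builtin module.

```v
// Hypothetical helpers for illustration only; not part of V's builtin.
const zwj = u32(0x200D) // ZERO WIDTH JOINER

// is_extend reports whether a code point attaches to the preceding rune.
// Only a handful of ranges are covered, not the full UAX #29 tables.
fn is_extend(code u32) bool {
	if code >= 0x0300 && code <= 0x036F {
		return true // combining diacritical marks
	}
	if code >= 0xFE00 && code <= 0xFE0F {
		return true // variation selectors
	}
	if code == 0x0E31 || (code >= 0x0E34 && code <= 0x0E3A) || (code >= 0x0E47 && code <= 0x0E4E) {
		return true // Thai combining vowels and tone marks
	}
	return false
}

// graphemes splits s into approximate grapheme clusters.
fn graphemes(s string) []string {
	mut clusters := []string{}
	mut current := []rune{}
	mut join_next := false
	for r in s.runes() {
		code := u32(r)
		attaches := join_next || code == zwj || is_extend(code)
		if current.len > 0 && !attaches {
			clusters << current.string()
			current = []rune{}
		}
		current << r
		join_next = code == zwj
	}
	if current.len > 0 {
		clusters << current.string()
	}
	return clusters
}

// right_align pads s on the left, counting clusters instead of bytes or runes.
// It assumes every cluster is one column wide, which does not hold for most emoji.
fn right_align(s string, width int) string {
	n := graphemes(s).len
	if n >= width {
		return s
	}
	return ' '.repeat(width - n) + s
}

fn main() {
	println('0123456789abcdefgh')
	for text in ['\u006E\u0303', 'ห์', 'ปีเตอร์'] {
		println(right_align(text, 10))
	}
}
```

A real implementation would presumably be driven by the Unicode grapheme break property data rather than hard-coded ranges, and display width would need separate handling.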
Current Behavior
Proposed behavior:
Further suggestions
e.g.
Other Information
Unicode Reference and some more info on the background
This feature would also fix this bug:
Acknowledgements
Version used
0.4.7
Environment details (OS name and version, etc.)