So, as we have learned, a Unicode character can be made of multiple bytes, but what looks like a single character can also be made of multiple other Unicode characters (code points). And they can be quite large: 35 bytes, in the earlier example.
package main

import (
	"fmt"
	"reflect"
)

func main() {
	// Most of these strings render as a single glyph but hold several runes.
	for _, s := range []string{"💋", "👩🏾‍❤️‍💋‍👩🏻", "👩🏿", "👩‍🚀"} {
		runes := []rune(s)
		fmt.Println(s, "is this many runes:", len(runes), "printed as strings:", runesAsStrings(runes))
	}

	// Creating a rune
	rune1 := 'B'
	rune2 := 'g'
	rune3 := '\a'

	// Displaying each rune, its bits, its code point, and its type (int32)
	fmt.Printf("Rune 1: %c; %08b Unicode: %U; Type: %s\n", rune1, rune1, rune1, reflect.TypeOf(rune1))
	fmt.Printf("Rune 2: %c; %08b Unicode: %U; Type: %s\n", rune2, rune2, rune2, reflect.TypeOf(rune2))
	fmt.Printf("Rune 3: %c; %08b Unicode: %U; Type: %s\n", rune3, rune3, rune3, reflect.TypeOf(rune3))
}

// runesAsStrings turns each rune into its own string, so the parts of a
// sequence are printed separately instead of being merged back into one glyph.
func runesAsStrings(runes []rune) (s []string) {
	for _, r := range runes {
		s = append(s, string(r))
	}
	return
}
That's why it's called a rune (a code point), and not a grapheme cluster ;)
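To make the byte and rune counts concrete, here is a minimal sketch that measures the couple-kissing sequence. It assumes the sequence is woman, medium-dark skin tone, ZWJ, heart, variation selector, ZWJ, kiss mark, ZWJ, woman, light skin tone; the helper name sizes is made up for this example. The string is spelled with escapes so each code point is visible in the source.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// kiss is the assumed couple-kissing sequence, one escape per code point.
const kiss = "\U0001F469\U0001F3FE\u200D\u2764\uFE0F\u200D\U0001F48B\u200D\U0001F469\U0001F3FB"

// sizes (hypothetical helper) reports the UTF-8 byte length and the
// rune (code point) count of s.
func sizes(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	nBytes, nRunes := sizes(kiss)
	fmt.Printf("%s: %d bytes, %d runes\n", kiss, nBytes, nRunes)

	// Ranging over a string yields one rune per step, with its byte offset.
	for offset, r := range kiss {
		fmt.Printf("offset %2d: %U (%d bytes)\n", offset, r, utf8.RuneLen(r))
	}
}
```

Under that assumption the sequence is 35 bytes and 10 runes, yet it is drawn as one glyph. Counting grapheme clusters themselves needs a text-segmentation library, because the Go standard library stops at runes.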
coding rules
UTF-8 Encoding
Bear plus snowflake equals polar bear
https://andysalerno.com/posts/weird-emojis/#
👩🏾 + ❤ + 💋 + 👩🏻 = 👩🏾‍❤️‍💋‍👩🏻
🐻 (bear; U+1F43B) + ❄ (snowflake; U+2744) = 🐻‍❄️ (polar bear; U+1F43B U+200D U+2744 U+FE0F)
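The polar bear can be rebuilt by hand from the code points listed above. This is a rough sketch: join is a helper invented for the example, and the snowflake is given its emoji variation selector (U+FE0F) so that it matches the tail of the polar bear sequence.

```go
package main

import "fmt"

// join (hypothetical helper) builds a string from a list of code points.
func join(codePoints ...rune) string {
	return string(codePoints)
}

func main() {
	bear := join(0x1F43B)
	snowflake := join(0x2744, 0xFE0F) // snowflake plus variation selector
	polarBear := join(0x1F43B, 0x200D, 0x2744, 0xFE0F)

	fmt.Println(bear, "+", snowflake, "=", polarBear)
	fmt.Println("runes:", len([]rune(polarBear)), "bytes:", len(polarBear))

	// The "+" in the text is really a zero width joiner (U+200D).
	fmt.Println(polarBear == bear+"\u200D"+snowflake)
}
```

The polar bear is four runes and 13 bytes. Whether it shows up as one animal or as a bear next to a snowflake depends on the font and renderer, not on Go.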
https://www.reddit.com/r/golang/comments/o1o5hr/fyi_a_single_go_rune_is_not_the_same_as_a_single