alphapapa / ement.el

A Matrix client for GNU Emacs
GNU General Public License v3.0

Inconsistent Emoji formatting #186

Open krachynski opened 11 months ago

krachynski commented 11 months ago

While viewing a room, I noticed that only the first emoji of one member's display name is shown when they send a message to the room. [image]

This morning I managed to catch them submitting a reaction, and when I moused over it, the full display name was rendered properly. [image]

Then I managed to catch them typing, and it seems that when the name is rendered in the buffer, the Unicode emoji are followed by very wide spacing, which makes their name disappear on the left. [image]

I don't know how much of the following affects this, but here goes: Windows 11, WSL 2, Debian Bullseye, basic Emacs install (apt-get install emacs).

So Emacs is running as a GUI app.

alphapapa commented 11 months ago

Hi Ken,

Thanks. What version of Emacs are you using? And have you followed the instructions in the readme about configuring Emacs's fontsets for emojis?

Finally, it would be helpful if you would copy and paste the user's displayname into the issue here. Maybe use ement-describe-room to find it in the list. There may be other hidden Unicode characters in it that are affecting spacing.
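One way to check a display name for hidden characters outside Emacs is to list its code points one by one. The following is a minimal Python sketch, not part of Ement: `describe_chars` is a hypothetical helper, and the sample string is illustrative rather than the user's actual display name.

```python
import unicodedata

def describe_chars(text):
    """Return (hex code point, Unicode name, general category) per character."""
    rows = []
    for ch in text:
        # Some code points have no name; fall back to a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", name, unicodedata.category(ch)))
    return rows

# Illustrative sample only: a flag emoji joined to another symbol.
sample = "\U0001F3F4\u200D\u2620\uFE0F"
for code, name, category in describe_chars(sample):
    print(code, category, name)
```

Characters in categories such as Cf (format) or Mn (nonspacing mark) are invisible on their own, so they are the ones most likely to go unnoticed when a name is pasted.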

krachynski commented 11 months ago

Oh, sorry. I did follow the guidance on setting emoji fontsets.

This is GNU Emacs 28.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.37, cairo version 1.16.0)

The display name is 🇨🇦🏴‍☠️Poρїŋѕҡi🏴‍☠️🇨🇦-The-Mandalorian

alphapapa commented 11 months ago

Thanks. That's very strange. My Emacs seems to display the string correctly. I don't know where the extra spaces would be coming from. Maybe it's a bug in Emacs, but if so, you would need to find a way to reproduce the problem outside of Ement in order to file a bug report about it.

I see that those Unicode characters are composed ones, e.g. the Canadian flag is:

             position: 158 of 195 (81%), column: 12
            character: 🇨 (displayed as 🇨) (codepoint 127464, #o370750, #x1f1e8)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F1E8
               script: emoji
               syntax: w    which means: word
             category: .:Base, L:Strong L2R
             to input: type "C-x 8 RET 1f1e8" or "C-x 8 RET REGIONAL INDICATOR SYMBOL LETTER C"
          buffer code: #xF0 #x9F #x87 #xA8
            file code: #xF0 #x9F #x87 #xA8 (encoded by coding system utf-8-unix)
              display: composed to form "🇨🇦" (see below)

Composed with the following character(s) "🇦" using this font:
  ftcrhb:-GOOG-Noto Color Emoji-normal-normal-normal-*-14-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 1 127464 1504 17 0 18 13 4 nil]
with these character(s):
  🇦 (#x1f1e6) REGIONAL INDICATOR SYMBOL LETTER A

Character code properties: customize what to show
  name: REGIONAL INDICATOR SYMBOL LETTER C
  general-category: So (Symbol, Other)
  decomposition: (127464) ('🇨')

And the pirate flag is:

             position: 160 of 195 (82%), column: 14
            character: 🏴 (displayed as 🏴) (codepoint 127988, #o371764, #x1f3f4)
              charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F3F4
               script: emoji
               syntax: w    which means: word
             category: .:Base
             to input: type "C-x 8 RET 1f3f4" or "C-x 8 RET WAVING BLACK FLAG"
          buffer code: #xF0 #x9F #x8F #xB4
            file code: #xF0 #x9F #x8F #xB4 (encoded by coding system utf-8-unix)
              display: composed to form "🏴‍☠️" (see below)

Composed with the following character(s) "‍☠️" using this font:
  ftcrhb:-GOOG-Noto Color Emoji-normal-normal-normal-*-14-*-*-*-m-0-iso10646-1
by these glyphs:
  [0 3 127988 1819 17 0 18 13 4 nil]
with these character(s):
  ‍ (#x200d) ZERO WIDTH JOINER
  ☠ (#x2620) SKULL AND CROSSBONES
  ️ (#xfe0f) VARIATION SELECTOR-16

Character code properties: customize what to show
  name: WAVING BLACK FLAG
  general-category: So (Symbol, Other)
  decomposition: (127988) ('🏴')
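The code points and UTF-8 bytes reported above can be cross-checked outside Emacs. This is a small Python sketch for that purpose only; `utf8_hex` is a hypothetical helper name, and its output format merely imitates the "buffer code" line.

```python
def utf8_hex(ch):
    """Format the UTF-8 bytes of a character like the 'buffer code' line."""
    return " ".join(f"#x{b:02X}" for b in ch.encode("utf-8"))

# The two base characters examined above, written as escapes.
for ch in ("\U0001F1E8", "\U0001F3F4"):
    print(ord(ch), hex(ord(ch)), utf8_hex(ch))
```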

Emacs 29 has a fix related to composed characters, IIRC, so I wonder if it would be fixed in Emacs 29. If you can, please download the latest Emacs 29 pretest and see if the problem persists in it. See: