rougier / nano-modeline

GNU Emacs / N Λ N O Modeline
GNU General Public License v3.0

Multibyte characters align error in modeline #72

Open TomoeMami opened 7 months ago

TomoeMami commented 7 months ago

When the date string contains multibyte characters such as CJK characters, the modeline is misaligned.

image

I found that the function nano-modeline--make uses length to calculate the alignment position:

https://github.com/rougier/nano-modeline/blob/2b0f03205c5c818a3a0622fd39779bec43dd5869/nano-modeline.el#L308C55-L308C73

That is fine for single-byte characters, but for multibyte characters length only counts the number of characters.

image

If the string contains multibyte characters, this is not necessarily the number of bytes in the string; it is the number of characters. To get the number of bytes, use ‘string-bytes’.

After I replaced length with string-bytes in nano-modeline.el, the time string was displayed correctly:

image

TomoeMami commented 7 months ago

Welcome to the Emacs shell

~/.emacs.d $ (length "周二 6 二月 2024")
12
~/.emacs.d $ (string-bytes "周二 6 二月 2024")
20
~/.emacs.d $ (length "Tue 6 Feb 2024")
14
~/.emacs.d $ (string-bytes "Tue 6 Feb 2024")
14

string-bytes returns the same result as length when dealing with single-byte characters.

image

aaronjensen commented 7 months ago

That doesn't look right either. The time should be aligned against the right edge, but in your screenshot there is additional space, which is exactly what I would expect if one were using the byte length rather than the display width.

I think you may want string-width, per https://github.com/skeeto/skeeto.github.com/blob/master/_posts/2014-06-13-Emacs-Unicode-Pitfalls.markdown#string-width

No guarantees that that works either. The docs say:

The effect of faces and fonts, including fonts used for non-Latin and
other unusual characters, such as emoji, is ignored, as are display
properties and invisible text.

TomoeMami commented 7 months ago

image

This is what I get with string-width. Maybe string-width is a better solution.
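
As a quick reference, here is a minimal sketch comparing the three measures discussed so far on the date string from this thread. The helper name my/measure-string is hypothetical and not part of nano-modeline, and the string-width figure assumes the default char-width-table, where CJK ideographs count as two columns.

```elisp
;; Hypothetical helper (not part of nano-modeline): collect the three
;; measures discussed in this thread for a string.
(defun my/measure-string (s)
  "Return a plist with the character count, byte count and column width of S."
  (list :length (length s)          ; number of characters
        :bytes  (string-bytes s)    ; bytes in Emacs's internal encoding
        :width  (string-width s)))  ; display columns (an approximation)

(my/measure-string "周二 6 二月 2024")
;; :length is 12 and :bytes is 20, matching the eshell session above.
;; :width should be 16 with the default `char-width-table'
;; (4 CJK characters x 2 columns + 8 ASCII characters x 1 column).
```

Only the column width accounts for wide characters taking two cells, which is why it gets closer to the right edge than either of the other two. It still ignores the actual font, so it remains an approximation on GUI frames.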

aaronjensen commented 7 months ago

That looks better, but still not right. Unfortunately, Emacs may not be able to get it right. string-width likely won't cause problems for Latin characters, so it may be safe.

TomoeMami commented 7 months ago

image

Setting the font to Sarasa Term SC or another monospaced font can help with alignment.

rougier commented 6 months ago

You can also use string-pixel-width to measure the actual size.

TomoeMami commented 6 months ago

image

Not so good. For me, a Windows Emacs user with CJK characters, the ranking by alignment quality seems to be string-width > string-bytes > length > string-pixel-width.

rougier commented 6 months ago

Weird, since string-pixel-width is measured in pixels and should be (very) approximately equal to (* string-width (frame-char-width)). Can you post the different measures for a given string (and possibly post the string so that I can test)? Note that if you use string-pixel-width, nano-modeline will need to be adapted here

Probably something like (,(string-pixel-width right)) (the extra parentheses mean pixels)
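
To make that suggestion concrete, a rough sketch of right-aligning a string by its pixel width could look like the following. The function my/right-align is hypothetical and is not the actual nano-modeline--make code. It assumes Emacs 29 or later, where string-pixel-width is available, and it ignores fringes and margins, which may need extra handling in a real modeline.

```elisp
;; Hypothetical sketch, not the actual nano-modeline code.
;; Assumes Emacs 29+ for `string-pixel-width'.
(defun my/right-align (left right)
  "Return LEFT and RIGHT joined so that RIGHT ends at the right edge.
The gap is a stretch space aligned to `right' minus the pixel width of
RIGHT.  Wrapping a number in a list, as in (N), makes it a pixel count
instead of a column count."
  (let ((pixels (string-pixel-width right)))
    (concat left
            (propertize " " 'display
                        `(space :align-to (- right (,pixels))))
            right)))
```

Because the offset is given in pixels, it follows whatever font is actually used for the right-hand string, rather than assuming every character is one or two columns wide.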

rougier commented 6 months ago

I get 120 from (string-pixel-width "周二 6 二月 2024")

TomoeMami commented 6 months ago

image

TomoeMami commented 6 months ago

I set the font size to 32. Will this affect string-pixel-width?

rougier commented 6 months ago

The documentation of string-width reads:

For these reasons, the results are just an approximation, especially
on GUI frames; for accurate dimensions of text as it will be
displayed, use string-pixel-width or window-text-pixel-size
instead.

so I think string-pixel-width is the best measure we can get.

TomoeMami commented 6 months ago

Yes, but string-pixel-width is affected by the font size.

image

This is the set-font function:

(defun set-font (english chinese english-size chinese-size)
  (set-face-attribute 'default nil :font
                      ;; (format   "%s:pixelsize=%d"  english english-size) :weight 'semi-bold
                      (font-spec :family english :size english-size))
  (set-face-attribute 'fixed-pitch nil :font
                      ;; (format   "%s:pixelsize=%d"  english english-size) :weight 'semi-bold
                      (font-spec :family "Cascadia Code" :size english-size))
  (dolist (charset '(kana han symbol cjk-misc bopomofo))
    (set-fontset-font (frame-parameter nil 'font) charset
                      (font-spec :family chinese :size chinese-size))))

rougier commented 6 months ago

But we need to take the font size into account when positioning the text, no?

TomoeMami commented 6 months ago

Another, less general approach: (setq system-time-locale "C"), and the time string will be shown in English.
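
A minimal sketch of that workaround, assuming the date is produced with format-time-string. The helper name my/english-date and the format string "%a %-d %b %Y" are assumptions for illustration and are not taken from nano-modeline. Binding system-time-locale with let limits the change to this one call instead of setting it globally.

```elisp
;; Hypothetical helper: format a date with English day and month names,
;; regardless of the user's locale.
(defun my/english-date (&optional time zone)
  "Format TIME (default: now) in the \"C\" locale, e.g. \"Tue 6 Feb 2024\".
ZONE is passed through to `format-time-string'."
  (let ((system-time-locale "C"))
    (format-time-string "%a %-d %b %Y" time zone)))
```

This sidesteps the width problem by keeping the string ASCII-only, so it does not help with other multibyte text in the modeline.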

image

rougier commented 6 months ago

Not sure I get your point here.