johnnovak / illwill

A curses inspired simple cross-platform console library for Nim
Do What The F*ck You Want To Public License

Displaying Multibyte Characters #32

Open Comamoca opened 2 years ago

Comamoca commented 2 years ago

Hello! I am creating a TUI application that displays Japanese using illwill. However, I am having a problem with displaying Japanese characters with illwill, and I am wondering what to do about it.

I would appreciate it if you could tell me about the above two questions. Thank you!

johnnovak commented 2 years ago

What platform are you using? On Windows, multibyte character support is problematic in the standard command prompt. It should work a lot better on Linux and MacOS.

Internally, illwill uses UTF-8 codepoints for everything, so things should just work. However, I haven't tested it specifically with languages that use multibyte characters heavily.

I'm suspecting that what you're experiencing is more like a console limitation.

Answering your questions:

  1. I don't know, I have no idea how Japanese characters work.
  2. It should already handle multibyte characters. I need more info on what problems you're experiencing and what setup you have; posting a test case here would help.

Comamoca commented 2 years ago

Thanks for your reply!

I am using Linux (Manjaro).

Here is the case in which the problem occurred. The display corruption below occurs when running the following code in Wezterm. I have tried running the same code in other terminals such as XfceTerminal, LXTerminal, and Kitty, and they all produce the same result.

Screenshot scshot_2022-05-28_04-29-20

Source Code

import illwill
import os
import json
import strformat
import strutils

# ------------------------------------
# Process to prepare content
# ------------------------------------

# Draw the preview area on the right side of the screen
proc previewArea(tb: var TerminalBuffer, content: string) =
  let width = toInt(tb.width / 3)

  tb.setForegroundColor(fgYellow)
  tb.drawRect(tb.width - width - 1, 0, tb.width-1, tb.height-1)
  tb.write(tb.width - width, 1, content)

# Draw the selection menu on the left side of the screen
proc selectArea(tb: var TerminalBuffer, y: int, pos: int, datas: seq[Results]) =
  for i, data in datas:
    if i == pos:
      tb.setForegroundColor(fgBlack, true)
      tb.setBackgroundColor(bgGreen)
      tb.write(2, y+i+1, data.title)
    else:
      tb.write(2, y+i+1, data.title)
    tb.resetAttributes()

proc exitProc() {.noconv.} =
  illwillDeinit()
  showCursor()
  quit(0)

illwillInit(fullscreen = true)
setControlCHook(exitProc)
hideCursor()

# cursor position
var pos: int

while true:
  var tb = newTerminalBuffer(terminalWidth(), terminalHeight())
  var key = getKey()

  tb.selectArea(0, pos, results)
  tb.previewArea(results[pos].content)

  case key
  of Key.J:
    pos = pos + 1
    if results.len <= pos:
      pos = 0
  of Key.K:
    pos = pos - 1
    if pos < 0:
      pos = results.len - 1
      continue
  of Key.Escape, Key.Q: exitProc()
  of Key.Enter:
    exitProc()
    echo pos
  else: discard

  tb.display()
  sleep(20)

johnnovak commented 2 years ago

So judging by your screenshot, I think what's happening is that the Japanese characters physically seem to take up the width of two Latin characters, but they're encoded as multiple UTF-8 code points, most likely not two codepoints, but perhaps more?
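For reference, a minimal Nim sketch (not part of illwill) that separates the quantities being discussed: UTF-8 bytes and Unicode code points. The sample string is an assumption chosen for illustration. Each kana in it is a single code point encoded as three UTF-8 bytes; the two-column width seen in the terminal is a display property of the character, not something the encoding records.

```nim
import unicode, strutils

# Three hiragana characters (illustrative sample, not from illwill).
let s = "いろは"

echo "bytes:       ", s.len       # number of UTF-8 bytes: 9
echo "code points: ", s.runeLen   # number of Unicode code points: 3

# Print each code point with its hex value and its UTF-8 byte length.
for r in s.runes:
  echo r, "  U+", toHex(int(r), 4), "  ", ($r).len, " bytes"
```

So a buffer that stores one code point per cell would count this string as three cells, while a terminal typically draws it across six columns.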

Like I said, I don't speak Japanese and know very little about the Japanese language, the symbols, and how the symbols are encoded on computers. I just found this page and there seems to be a lot of complexity regarding Japanese encodings:

https://www.sljfaq.org/afaq/encodings.html

I'm somewhat interested in getting to the bottom of this as it might affect not just Japanese but other non-Latin languages as well. But you'll need to provide a program that I can compile and execute — the above program you posted does not compile; I'm guessing it's a part of a larger program, and it doesn't output any Japanese characters, so it doesn't allow me to reproduce the issue visually on my computer.

Please don't assume anything, and provide all the following:

  1. What character encoding are you using in your terminal (is it UTF-8? is it something else that you need to use to display Japanese characters correctly?)
  2. A self-contained Nim program that I can compile and run that outputs some lines of text in Japanese that demonstrates the alignment problem.
  3. If you put the same few lines of text into a text file and display it in the console with echo in the same terminal, or open the text file in Vim (in the same terminal), do the alignment issues still happen? A text file like that would be very useful for testing.

Comamoca commented 2 years ago

The encoding used in the terminal is UTF-8. In addition, Japanese fonts must be installed to correctly display Japanese characters on the terminal. The font used on my terminal is UDEV-Gothic, but if you are using a Linux distribution or similar, noto-fonts-cjk is the easiest to install. There is probably no difference in display between the different fonts.

This program reads a file named test.txt in the current directory and displays it on the left side of the screen, with a rectangle that imitates a preview window on the right side. Please place the attached text file in the same directory and run it.

Screenshot 2022-05-30_02-47-15

Also, here is a screenshot of these files opened in Vim. I have not encountered any problems displaying Japanese there.

Screenshot 2022-05-30_02-49-36

Programs

import illwill
import os
import strutils

# Read the file at `path` and return its contents as a sequence of lines.
proc loadTxtFile(path: string): seq[string] =
  readFile(path).splitLines()

proc drawText(tb: var TerminalBuffer, texts: seq[string]) =
  for idx, text in texts:
    tb.write(0, terminalHeight()-idx, text)

proc exitProc() {.noconv.} =
  illwillDeinit()
  showCursor()
  quit(0)

illwillInit(fullscreen = true)
setControlCHook(exitProc)
hideCursor()

var pos: int

var texts = loadTxtFile("test.txt")
while true:
  var tb = newTerminalBuffer(terminalWidth(), terminalHeight())
  var key = getKey()

  tb.drawText(texts)
  tb.drawRect(terminalWidth() div 2, 0, terminalWidth(), terminalHeight())

  case key
  of Key.Escape, Key.Q: exitProc()
  of Key.Enter:
    exitProc()
    echo pos
  else: discard

  tb.display()
  sleep(20)

Text file used for loading. (Save the file as test.txt.)

いろはにほへと ちりぬるを
わかよたれそ  つねならむ
うゐのおくやま けふこえて
あさきゆめみし ゑひもせすん

色は匂へど 散りぬるを
我が世誰そ 常ならむ
有為の奥山 今日越えて
浅き夢見じ 酔ひもせず

johnnovak commented 2 years ago

Cheers, I'm busy with other stuff now, I'll have a look at this at some point.

forthlee commented 1 year ago

Try modifying displayFull() in illwill.nim:

proc displayFull(tb: TerminalBuffer) =
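  # Each entry is (last code point of a range, display width in columns).
  # Entries are sorted ascending, so the first match gives the width.
  let widthTable = [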
    (126,  1), (159,  0), (687,   1), (710,  0), (711,  1),
    (727,  0), (733,  1), (879,   0), (1154, 1), (1161, 0),
    (4347,  1), (4447,  2), (7467,  1), (7521, 0), (8369, 1),
    (8426,  0), (9000,  1), (9002,  2), (11021, 1), (12350, 2),
    (12351, 1), (12438, 2), (12442,  0), (19893, 2), (19967, 1),
    (55203, 2), (63743, 1), (64106,  2), (65039, 1), (65059, 0),
    (65131, 2), (65279, 1), (65376,  2), (65500, 1), (65510, 2),
    (120831, 1), (262141, 2), (1114109, 1)
  ]

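  # Return the number of terminal columns occupied by code point `c`.
  proc getWidth(c: int): int =
    # Shift Out / Shift In control characters take no space.
    if c == 0xe or c == 0xf:
      return 0
    for (num, wid) in widthTable:
      if c <= num:
        return wid
    return 1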

  var buf = ""
  # Number of blank cells still to drop after wide characters.
  var skipNo = 0

  proc flushBuf() =
    if buf.len > 0:
      put buf
      buf = ""

  for y in 0..<tb.height:
    setPos(0, y)
    for x in 0..<tb.width:
      let c = tb[x,y]
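      if c.bg != gCurrBg or c.fg != gCurrFg or c.style != gCurrStyle:
        flushBuf()
        setAttribs(c)
      var cc = $c.ch
      # A wide character fills (width - 1) extra terminal columns, so drop
      # that many of the blank cells that follow it. Zero-width characters
      # are clamped so they cannot cancel a pending skip.
      skipNo += max(getWidth(cc.runeAt(0).int) - 1, 0)
      if cc == " " and skipNo > 0:
        cc = ""
        skipNo -= 1
      buf &= cc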

    flushBuf()
    skipNo = 0

Comamoca commented 1 year ago

I tried that and got the expected behavior. Thank you!

image
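The idea behind the patch can also be tried outside illwill. The following is a minimal sketch, assuming a deliberately incomplete table of wide ranges (the names `wideRanges`, `runeWidth`, `displayWidth` and `renderRow` are hypothetical and not part of illwill or Urwid). A complete table would be derived from Unicode's East Asian Width data. `renderRow` mimics the skip logic: for every extra column a wide character occupies, one later blank cell is dropped so the row keeps its intended width.

```nim
import unicode

# Hypothetical, incomplete list of (first, last) code point ranges that are
# treated as two columns wide. Enough for the test file in this thread.
const wideRanges = [
  (0x1100, 0x115F),  # Hangul Jamo initial consonants
  (0x3000, 0x303E),  # CJK symbols and punctuation
  (0x3041, 0x30FF),  # Hiragana and Katakana
  (0x4E00, 0x9FFF),  # CJK Unified Ideographs
  (0xAC00, 0xD7A3),  # Hangul syllables
  (0xFF01, 0xFF60)   # Fullwidth forms
]

const blank = Rune(0x20)

proc runeWidth(r: Rune): int =
  ## Columns occupied by `r` under the simplified table above.
  result = 1
  let cp = int(r)
  for (lo, hi) in wideRanges:
    if cp >= lo and cp <= hi:
      return 2

proc displayWidth(s: string): int =
  ## Total terminal columns needed to print `s`.
  for r in s.runes:
    result += runeWidth(r)

proc renderRow(cells: seq[Rune]): string =
  ## Emit one row of cells, dropping one blank cell for each extra column
  ## consumed by an earlier wide character.
  var skip = 0
  for r in cells:
    if r == blank and skip > 0:
      dec skip
      continue
    skip += runeWidth(r) - 1
    result.add $r

when isMainModule:
  # A 9-cell row: two wide characters, six blanks, then a border character.
  let row = toRunes("色は      |")
  echo displayWidth($row)            # columns if every cell is emitted
  echo displayWidth(renderRow(row))  # columns after dropping blanks
```

Emitting every cell of that row takes 11 columns, which is what pushes the border to the right; after dropping two blanks it takes 9, matching the number of cells. The approach only works when enough blank cells follow the wide characters on the same row.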

Comamoca commented 1 year ago

@forthlee I am thinking of using this code to create a Japanese (and also Chinese, Korean, etc.) fork of illwill. Would it be ok to include the code you provided in the fork? Thank you.

forthlee commented 1 year ago

The code is adapted from Urwid. Urwid is licensed under the GNU Lesser General Public License v2.1.

johnnovak commented 1 year ago

> @forthlee I am thinking of using this code to create a Japanese (and also Chinese, Korean, etc.) fork of illwill. Would it be ok to include the code you provided in the fork? Thank you.

Feel free to raise a PR if you think Japanese/Asian languages can be supported unobtrusively.

Comamoca commented 1 year ago

@forthlee

Thanks! I have spring break, so I want to work on this issue.

Comamoca commented 1 year ago

@johnnovak

I will consider sending a PR once I have made some progress. I'd appreciate your help then. 😉

johnnovak commented 1 year ago

> @johnnovak
>
> I will consider sending a PR once I have made some progress. I'd appreciate your help then. 😉

Sure thing. Good luck! 😎