Open sorawee opened 3 years ago
I am guessing this is an issue with either text% or perhaps the drawing libraries (accessed via, e.g., canvas-dc%), but maybe on a non-Mac platform? Or maybe a specific font? (It looks okay to me.)
Here's some code that might reproduce the issue outside of DrRacket (if it isn't a font-specific issue).
#lang racket/gui
(define s "กำหนด ความกว้าง")
(define t (new text%))
(define f (new frame% [label ""][width 300] [height 300]))
(define ec (new editor-canvas% [parent f] [editor t]))
(send t insert s)
(send f show #t)
Sorry, should have mentioned that I'm on Mac. The program that you provided above does reproduce the issue, though weirdly, "กำ" is now displayed correctly! "กว้าง" is still incorrect however.
This is not a font-specific issue, IIUC. Even with the font TH Sarabun New (the standard font for Thai script), the issue persists in DrRacket.
Here's how it is displayed in word processing software.
I think the problem is more generally with Unicode combining characters:
#lang racket/base
(define chars '(#\e #\u0301))
(displayln chars)
(displayln (list->string chars))
(newline)
(define precomposed-chars
  ((compose string->list string-normalize-nfc list->string)
   chars))
(displayln precomposed-chars)
(displayln (list->string precomposed-chars))
Related? https://github.com/racket/draw/issues/22
According to a comment in this issue, DrRacket always uses #f for the combine? parameter to the draw-text method of dc<%>. And the code has this comment: https://github.com/racket/draw/blob/a4e156abe5119309783443495d671b9a7f3e434b/draw-lib/racket/draw/private/dc.rkt#L1493
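The effect of combine? can also be tried outside DrRacket. Here is a minimal sketch, assuming racket/draw is installed: it draws the same Thai string into an offscreen bitmap once with combine? as #f and once as #t, then measures both with get-text-extent. The bitmap size, the coordinates, and the measure helper are illustrative choices, not taken from DrRacket's code.

```racket
#lang racket/base
(require racket/class racket/draw)

(define s "กำหนด ความกว้าง")

;; Offscreen target so no window is needed (size is arbitrary).
(define bm (make-bitmap 300 80))
(define dc (new bitmap-dc% [bitmap bm]))

;; Hypothetical helper: width and height of `str` for a given combine? value.
;; Passing #f for the font means "use the dc's current font".
(define (measure str combine?)
  (define-values (w h descent extra)
    (send dc get-text-extent str #f combine?))
  (values w h))

;; combine? = #f: each character is handled separately.
(send dc draw-text s 10 10 #f)
;; combine? = #t: adjacent characters may be combined when shaping.
(send dc draw-text s 10 40 #t)

(define-values (w-split h-split) (measure s #f))
(define-values (w-joined h-joined) (measure s #t))
(printf "combine? #f: ~a x ~a\n" w-split h-split)
(printf "combine? #t: ~a x ~a\n" w-joined h-joined)
```

If the two measurements differ, or the two rows look different after saving bm with save-file, that would point at combine? rather than at text% itself.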
In the latest version of DrRacket, things are a bit flipped. Running @rfindler's program, we will get:
where กำ, which consists of two characters ก and ำ, is displayed without the circle on top of ก. Note though that กว้าง is now displayed correctly.
It's somewhat weird, because this display problem only occurs when I choose not to "normalize" when pasting the code in. If I normalize, I do get the desired display, but now กำ becomes 3 characters: ก, ํ, and า, which is incorrect in Thai. ำ is one character, and is not equivalent to ํ + า.
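The split can be checked directly. This is a minimal sketch, assuming the paste-time normalization is compatibility normalization (NFKC), which is not confirmed in this thread. In the Unicode data, ำ (U+0E33) has only a compatibility decomposition to ํ (U+0E4D) + า (U+0E32), so canonical normalization (NFC) should leave it alone while NFKC splits it. The code-points helper is only for illustration.

```racket
#lang racket/base

;; THAI CHARACTER SARA AM (ำ), a single code point.
(define sara-am "\u0E33")

;; Illustrative helper: list the code points of a string in hex.
(define (code-points str)
  (for/list ([c (in-string str)])
    (number->string (char->integer c) 16)))

;; Canonical normalization.
(define nfc (code-points (string-normalize-nfc sara-am)))
;; Compatibility normalization.
(define nfkc (code-points (string-normalize-nfkc sara-am)))

(printf "NFC:  ~a\n" nfc)
(printf "NFKC: ~a\n" nfkc)
```

If the assumption holds, NFC reports the single code point e33 and NFKC reports e4d followed by e32, which matches the three-character result described above.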
I wanted to try this again after the recent Unicode change, and just noticed a couple more issues (which already existed before the Unicode change).
Steps to reproduce:
Paste (ความกว้าง 500) into DrRacket. Notice that the number 500 is not syntax-highlighted correctly.
The problem with (ความกว้าง 500) should be fixed by the snip-lib commit.
DrRacket can't display Thai diacritics (and probably those of other languages with diacritics) correctly in the code editor.
Here's how it should be displayed:
(กำหนด ความกว้าง 500)
FWIW, Emacs is able to display it correctly.
@mbutterick's quad used to have an issue with diacritics too (though it's a different problem), so let me @ you in case you have an idea what could go wrong.