racket / drracket

DrRacket, IDE for Racket
http://www.racket-lang.org/

Dr. Racket does not combine "Combining Macron Below" with previous character when rendering unicode #558

Open rxg opened 2 years ago

rxg commented 2 years ago

On macOS, in racket, if I run (bytes->string/utf-8 (bytes #x77 #xcc #xb1))

I get: "w̱" (w with an underline below it), but in Dr. Racket I get a w with an underline next to it, as though they were separate characters.

See: https://en.wikipedia.org/wiki/Macron_below
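
To make the report concrete, here is a minimal sketch of what those three bytes decode to. The names `w-str`, `code-points`, and `categories` are illustrative only and not part of the original report. It shows that the string holds two separate characters, a plain "w" followed by U+0331 (COMBINING MACRON BELOW), so placing the mark under the letter is left to the text renderer.

```racket
#lang racket/base

;; The bytes from the report: #x77 is "w", and #xcc #xb1 is the
;; UTF-8 encoding of U+0331 COMBINING MACRON BELOW.
(define w-str (bytes->string/utf-8 (bytes #x77 #xcc #xb1)))

;; Code points of each character in the decoded string.
(define code-points (map char->integer (string->list w-str)))

;; Unicode general categories: 'll is a lowercase letter,
;; 'mn is a nonspacing (combining) mark.
(define categories (map char-general-category (string->list w-str)))

(printf "code points: ~a\n" code-points) ; (119 817), i.e. #x77 and #x331
(printf "categories:  ~a\n" categories)  ; (ll mn)
```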

For context: I noticed this while representing "Squamish" in its native orthography:

(define B (bytes #x53 #xe1 #xb8 #xb5 #x77 #x78 #xcc #xb1 #x77 #xc3 #xba #x37 #x6d #x65 #x73 #x68))

(bytes->string/utf-8 B)
"Sḵwx̱wú7mesh"

In Dr. Racket, the k is properly rendered with the underscore (to be fair, this is a single precomposed Unicode character).

But the x renders without the overlap.

Interestingly enough, running string-length on the above gives 12, but I'm guessing that this is the appropriate answer if "combining diacritical marks" should count toward length.
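
As a rough illustration of why the length comes out as 12, the sketch below compares the string's length under the two canonical Unicode normalization forms. The names `squamish`, `len-raw`, `len-nfc`, and `len-nfd` are made up for this example. `string-length` counts characters (code points), so the combining macron below after the "x" counts as its own character. NFC cannot fold it into the "x" because Unicode has no precomposed "x with line below", while NFD splits the precomposed ḵ and ú into a base letter plus a mark.

```racket
#lang racket/base

;; "Sḵwx̱wú7mesh", built from the same UTF-8 bytes as in the report.
(define squamish
  (bytes->string/utf-8
   (bytes #x53 #xe1 #xb8 #xb5 #x77 #x78 #xcc #xb1
          #x77 #xc3 #xba #x37 #x6d #x65 #x73 #x68)))

;; Length as decoded: one character per code point.
(define len-raw (string-length squamish))

;; Length after canonical composition (NFC) and decomposition (NFD).
(define len-nfc (string-length (string-normalize-nfc squamish)))
(define len-nfd (string-length (string-normalize-nfd squamish)))

(printf "raw: ~a  NFC: ~a  NFD: ~a\n" len-raw len-nfc len-nfd)
;; prints: raw: 12  NFC: 12  NFD: 14
```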

rfindler commented 2 years ago

I think this has to do with the way the editor libraries draw text. It is possible to call draw-text in a way that puts the bit that's supposed to be under the "w" actually under it (via the boolean combine? argument passed to draw-text), but I don't know the ramifications of trying to change the editor library to use that drawing mode.

#lang racket/gui

(define str (bytes->string/utf-8 (bytes #x77 #xcc #xb1)))

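(define (draw c dc)
  ;; The fourth argument to draw-text is combine?.
  ;; #f: each character is drawn separately.
  (send dc draw-text str 20 10 #f)
  ;; #t: adjacent characters may be combined (ligatures, combining marks, kerning).
  (send dc draw-text str 20 40 #t))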

(define f (new frame% [label ""] [width 400] [height 400]))
(define c (new canvas% [parent f] [paint-callback draw]))
(send f show #t)
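
The two drawing modes can also be compared without opening a window. The following is only a sketch: it assumes the `racket/draw` library is available, uses an off-screen `bitmap-dc%` with its default font, and introduces names (`w-separate`, `w-combined`, and so on) purely for illustration. It calls `get-text-extent`, which accepts the same combine? flag as `draw-text`. The numbers it prints depend on the platform and font, so no particular output should be expected.

```racket
#lang racket/base
(require racket/class racket/draw)

;; "w" followed by U+0331 COMBINING MACRON BELOW, as in the report.
(define str (bytes->string/utf-8 (bytes #x77 #xcc #xb1)))

;; An off-screen drawing context; the 100x100 size is arbitrary.
(define dc (new bitmap-dc% [bitmap (make-bitmap 100 100)]))

;; get-text-extent returns width, height, descent, and extra vertical space.
;; Its second argument is the font (#f means the dc's current font) and
;; its third argument is combine?, with the same meaning as in draw-text.
(define-values (w-separate h-separate d-separate e-separate)
  (send dc get-text-extent str #f #f))
(define-values (w-combined h-combined d-combined e-combined)
  (send dc get-text-extent str #f #t))

(printf "width with combine? #f: ~a\n" w-separate)
(printf "width with combine? #t: ~a\n" w-combined)
```

If the two widths differ, the mark is likely being measured as its own glyph in one mode and attached to the "w" in the other.
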
rxg commented 2 years ago

Thanks Robby! Curious: if I put the full string into the above code, the second drawing (at 20,40) renders on the canvas differently than in my Dr. Racket interaction from the original post. On the canvas at 20,40 I see the second underline below the x that follows the w, whereas in my Dr. Racket interaction I see what looks like "w_x" with nothing above the underline. I don't understand how this interacts with fonts or font size so that may be what's happening if the default canvas font is not the same as what's in Dr. Racket (my size is definitely different).

rfindler commented 2 years ago

Oh, yeah, good point! I see something different there too and I'm not sure what's going on, actually. I've put a screenshot. Is that something similar to what you're seeing? (The definitions window contains the above code, so str is the same one that's being passed to draw-text.)

[Screenshot: Screen Shot 2022-04-26 at 3 05 38 PM]

rxg commented 2 years ago

Yes that's what I saw when I ran the code!

rfindler commented 2 years ago

Not sure it is helpful, but under Linux it appears to draw in only two ways (I guess the difference in this screenshot is the font).

[Screenshot: Screen Shot 2022-04-27 at 9 31 15 AM]

rfindler commented 2 years ago

And here's Windows; it also looks like things get drawn in only two ways.

[Screenshot: Screen Shot 2022-04-27 at 9 57 41 AM]

gus-massa commented 2 years ago

It looks like a hard problem. There are a few similar reports. I'm posting them in case they have some hints:

[In the first and second previous reports, did the accent move in a different direction, or am I just misinterpreting them?]