DistributedProofreaders / guiguts-py

Guiguts rewrite using Python/tkinter
GNU General Public License v2.0

Astral plane characters not handled well by S/R dialog fields #284

Open windymilla opened 2 weeks ago

windymilla commented 2 weeks ago

Two bugs (at least)

  1. If you insert an astral plane character in the Search field (e.g. press Ctrl-semicolon, then type x23456<Return>) and then try to delete it, each press of Delete removes only half of the character, so you need to press Delete twice.
  2. If you search for astral plane character(s), two characters are highlighted in the main window for every astral plane character in the search string. The search seems to treat each astral plane character as two characters long, which messes up the highlighting code.
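Both symptoms are consistent with the character being handled as two UTF-16 code units rather than one code point. A minimal sketch of that difference in plain Python (the helper name `utf16_units` is made up for illustration and is not part of guiguts-py):

```python
def utf16_units(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode text."""
    # Each UTF-16 code unit is 2 bytes; astral plane characters need two units.
    return len(text.encode("utf-16-le")) // 2


astral = "\U00023456"  # the character from the report above
print(len(astral))          # length in Python code points
print(utf16_units(astral))  # length in UTF-16 code units
```

Python counts U+23456 as one character, while a UTF-16 view of the same string sees a surrogate pair, i.e. two units.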

Things to look up:

windymilla commented 2 weeks ago

One potential solution might be to replace the Entry fields with one line Text widgets.

Edited to add: although this would remove the need to hit delete twice to delete a character, it wouldn't fix the problem mentioned below, that the match count returned by the Tk search is 2 per astral plane character.

windymilla commented 1 week ago

Note to self: The character "count" returned by Text.search is 2 for a character that takes 2 code units, like U+23456. The "column" goes up by 2 as well. There must be something special to make the delete key delete "both" characters.
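If the count and column really are in UTF-16 code units, as observed above, one possible workaround is to convert them back to code-point offsets using the text of the line. This is only a sketch under that assumption; `units_to_codepoints` is a hypothetical helper, not an existing guiguts-py or tkinter function:

```python
def units_to_codepoints(line: str, unit_offset: int) -> int:
    """Convert a UTF-16 code-unit offset within line to a code-point offset.

    An offset that falls in the middle of a surrogate pair is rounded up
    to the next whole character.
    """
    units = 0
    for index, char in enumerate(line):
        if units >= unit_offset:
            return index
        # Characters above U+FFFF occupy two UTF-16 code units.
        units += 2 if ord(char) > 0xFFFF else 1
    return len(line)


line = "a\U00023456b"
print(units_to_codepoints(line, 3))  # offset of "b" in code points
```

The same function can be applied to a match "count" by passing the text starting at the match position.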

General consensus is that current Tcl (8.6) doesn't handle astral plane characters perfectly. Also see here: https://www.tcl.tk/software/tcltk/9.0.html (currently in beta testing)

Highlight of Tcl 9.0: "full codepoint range"

windymilla commented 1 week ago

A further possible solution is one that might be needed anyway, which is to load the whole file into a Python string and do the search in there, rather than use the Tcl/Tk Text.search method.
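A rough sketch of what searching in a Python string could look like, assuming the file contents are already in `text` and that only literal (non-regex) searches are needed. The function name `find_matches` and the `line.column` result format are illustrative choices, not the project's actual design:

```python
import re


def find_matches(text: str, pattern: str) -> list[tuple[str, int]]:
    """Return (index, length) for each literal match of pattern in text.

    index is a Tk-style "line.column" string with 1-based lines and 0-based
    columns; column and length are counted in Python code points.
    """
    results = []
    for match in re.finditer(re.escape(pattern), text):
        start = match.start()
        line_no = text.count("\n", 0, start) + 1
        line_start = text.rfind("\n", 0, start) + 1  # 0 when on the first line
        results.append((f"{line_no}.{start - line_start}", len(match.group())))
    return results


sample = "ab\U00023456c\nx\U00023456"
print(find_matches(sample, "\U00023456"))
```

Each astral plane character counts as one character here. Whether these columns can be passed straight to the Text widget under Tcl 8.6, or need converting to code units first, would have to be checked.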