ianstormtaylor / slate

A completely customizable framework for building rich text editors. (Currently in beta.)
http://slatejs.org
MIT License
29.62k stars 3.23k forks source link

Issue with Thai language on deleteBackward #4884

Closed tithanayut closed 2 years ago

tithanayut commented 2 years ago

Description A Thai word is typically composes from base consonant and vowels/tone marks above base consonant. Taking ปี่ for example, ป is base consonant and the other two are vowels/tone marks.

When deleting backward, vowel/tone mark should be deleted one by one. See the recording of Notepad on Windows for example.

https://user-images.githubusercontent.com/30551284/157884121-2c24f25b-0b56-4f4a-a041-14d211dbdc74.mp4

Slate, however, delete base consonant and vowels/tone marks above it altogether. See the recording below.

https://user-images.githubusercontent.com/30551284/157884550-5633baa6-cf77-4a70-9ead-a27c795d5199.mp4

Note that current Slate behavior is expected for deleteForward (where deleting base consonant should delete vowels/tone marks above it as well), but not for deleteBackward.

Sandbox The demo page of Slate www.slatejs.org/examples/richtext exhibits this problem in action.

Steps To reproduce the bug:

  1. Go to www.slatejs.org/examples/richtext
  2. Type in Thai words with above vowels/tone marks. For example, ปี่ปี่ปี่.
  3. Press backspace to deleteBackward.
  4. See the aforementioned bug.

Expectation When a Thai word is deleting from backward, vowel/tone mark should be deleted one by one. See the first recoding on Notepad for example.

Environment

tithanayut commented 2 years ago

It seems that AndroidEditable doesn't have this problem while DefaultEditable does.

https://github.com/ianstormtaylor/slate/blob/2a620dc20d808942dc0b1866cbdaf58418156c46/packages/slate-react/src/index.ts#L7-L7

tithanayut commented 2 years ago

I found that this issue has began since PR #4671.

It fixes the jump of cursor (for Thai language as well!). However, I believe the logic of moving cursor backward and deleting character is different. The former should ignore vowels/tone marks (currently working perfectly). The latter, however, should delete them one by one.

If I am not mistaken, it seems that both functions use getCharacterDistance to determine the number of characters to move/delete. This number is perfect for determining the distance to move cursor and deleteForward. Unfortunately, it is not for deleteBackward I believe. https://github.com/ianstormtaylor/slate/blob/e3afda946685795237f748e76c7bb051c09cb7fa/packages/slate/src/utils/string.ts#L15

tithanayut commented 2 years ago

According to reference [1] in string.ts,

This document defines a default specification for grapheme clusters. It may be customized for particular languages, operations, or other situations. For example, arrow key movement could be tailored by language, or could use knowledge specific to particular fonts to move in a more granular manner, in circumstances where it would be useful to edit individual components. This could apply, for example, to the complex editorial requirements for the Northern Thai script Tai Tham (Lanna). Similarly, editing a grapheme cluster element by element may be preferable in some circumstances. For example, on a given system the backspace key might delete by code point, while the delete key may delete an entire cluster.

"editing a grapheme cluster element by element may be preferable in some circumstances"

So this looks like a special case for deleteBackward to handle a few language like Thai.