mjakeman / text-engine

A lightweight rich text framework for GTK

Formalise character positioning #17

Open mjakeman opened 2 years ago

mjakeman commented 2 years ago

aka 'The wonderful wild world of word wrapping' :/

Introduction

The way we currently handle cursor positions is a mess. Each paragraph is composed of several 'runs', with each run being a block of contiguously formatted text. Text is indexed on a per-paragraph basis.
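To make the run/paragraph indexing concrete, here is a minimal sketch of resolving a per-paragraph index to a run and an offset within that run. The `Run` and `Paragraph` structs and both functions are hypothetical names for illustration, not the actual text-engine types, and the text is assumed to be ASCII so that one byte equals one character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not the real text-engine structures. */
typedef struct {
    const char *text; /* one block of contiguously formatted text (ASCII assumed) */
} Run;

typedef struct {
    const Run *runs;
    size_t n_runs;
} Paragraph;

/* Total number of characters across all runs in the paragraph. */
static size_t paragraph_length(const Paragraph *p)
{
    size_t total = 0;
    for (size_t i = 0; i < p->n_runs; i++)
        total += strlen(p->runs[i].text);
    return total;
}

/* Resolve a paragraph-level index to (run, offset within run).
 * An index on a run boundary resolves to the start of the following run;
 * the index equal to the paragraph length resolves to the end of the last run.
 * Returns 0 on success, -1 if the index is out of range. */
static int paragraph_locate(const Paragraph *p, size_t index,
                            size_t *run_out, size_t *offset_out)
{
    size_t start = 0;
    for (size_t i = 0; i < p->n_runs; i++) {
        size_t len = strlen(p->runs[i].text);
        int is_last = (i + 1 == p->n_runs);
        if (index < start + len || (is_last && index == start + len)) {
            *run_out = i;
            *offset_out = index - start;
            return 0;
        }
        start += len;
    }
    return -1;
}
```

With the runs "Hello " and "World", paragraph index 7 lands in the second run at offset 1.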

Character movement is handled by the TextEditor class, while home/end movement is handled by the TextDisplay class. This is because at present a TextDocument is a semantic description of the document, paired with a TextLayout to create the actual formatted document. TextEditor operates directly on the semantic document (as it should, at least for now).

Traversal between paragraphs is not a problem as each paragraph can be considered a 'self-contained' block, so going from the end of one paragraph to the start of another paragraph is one movement backwards/forwards.
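As a rough sketch of that rule: a cursor is a (paragraph, index) pair, and stepping past either edge of a paragraph costs exactly one movement. The `Cursor` struct and the `lengths` array are assumptions made for this example, not the TextEditor API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor: which paragraph, and the index within it. */
typedef struct {
    size_t paragraph;
    size_t index;
} Cursor;

/* lengths[i] is the number of characters in paragraph i (assumed input). */
static Cursor cursor_move_right(Cursor c, const size_t *lengths, size_t n_paragraphs)
{
    assert(c.paragraph < n_paragraphs);
    if (c.index < lengths[c.paragraph]) {
        c.index++;                      /* still inside the paragraph */
    } else if (c.paragraph + 1 < n_paragraphs) {
        c.paragraph++;                  /* end of paragraph -> start of the next */
        c.index = 0;
    }
    return c;                           /* unchanged at the end of the document */
}

static Cursor cursor_move_left(Cursor c, const size_t *lengths, size_t n_paragraphs)
{
    assert(c.paragraph < n_paragraphs);
    if (c.index > 0) {
        c.index--;                      /* still inside the paragraph */
    } else if (c.paragraph > 0) {
        c.paragraph--;                  /* start of paragraph -> end of the previous */
        c.index = lengths[c.paragraph];
    }
    return c;                           /* unchanged at the start of the document */
}
```

Note that every paragraph has `length + 1` valid cursor indices, which is the same 'extra index' that shows up again on the final line of a wrapped paragraph.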

The problem arises when dealing with paragraphs that span multiple lines, and particularly when words themselves are split instead of wrapped. When we use home/end, where should the cursor go?

Google Docs:

Paragraph wraps, word is not split:

The space is used to correctly handle traversal between the end of the first line and the start of the second. No special case needed.

https://user-images.githubusercontent.com/12368711/186341319-98a8d8b7-4e8d-4861-b68f-c62790496086.mov

Paragraph wraps, word is split:

Docs appears to use a 'before character' approach, in that the caret position is determined by the following character. This is particularly nice because pressing end takes you to the final index position on the line, belonging to the final character on the line (i.e. the space or tab). For the final line in the paragraph, however, there is no break, so we need to account for the extra index.

https://user-images.githubusercontent.com/12368711/186343400-78431c87-3280-49d5-a436-2b4a1dc2f063.mov

Text Engine:

Paragraph wraps, word is not split:

Normal navigation works, but the current state of home/end dumps the cursor on the line after. This is probably fixable by choosing the index preceding the final character on the line.

https://user-images.githubusercontent.com/12368711/186342110-83bfb1a6-5a1a-4651-bc9e-e5f23833d494.mov

Paragraph wraps, word is split:

Again this works similarly to Google Docs, however we have the additional '-' character inserted by Pango when word wrapping. We could probably disable this? The issue here, which makes it look much worse than it is, stems from 'end' using the character after rather than the character before. This makes it appear as if two characters are being skipped. Switching to a 'character before' system puts us on par with Google Docs and is probably the best path forward:

https://user-images.githubusercontent.com/12368711/186342788-46513b95-1591-453c-892e-7d8952e6e086.mov

Again we need to account for the additional index on the final line, as there is no 'break' character.
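One possible reading of the home/end rule described above, as a sketch: on every line except the last, 'end' places the caret before the final character on the line (the break character when the word is not split); on the last line it uses the extra index after the final character. The `line_starts` array is an assumed representation of where each visual line begins within the paragraph. In practice these values would come from the layout engine; all function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* line_starts[i] is the paragraph index of the first character on visual
 * line i. Assumed: line_starts[0] == 0 and values are strictly increasing. */

/* Which visual line does a paragraph index fall on? */
static size_t line_of_index(const size_t *line_starts, size_t n_lines, size_t index)
{
    assert(n_lines > 0);
    size_t line = 0;
    while (line + 1 < n_lines && line_starts[line + 1] <= index)
        line++;
    return line;
}

/* 'Home': the index of the first character on the line. */
static size_t line_home(const size_t *line_starts, size_t n_lines, size_t line)
{
    assert(line < n_lines);
    return line_starts[line];
}

/* 'End': the caret sits before the final character on the line, except on
 * the last line, where there is no break character and the extra index
 * (equal to the paragraph length) is used instead. */
static size_t line_end(const size_t *line_starts, size_t n_lines,
                       size_t paragraph_length, size_t line)
{
    assert(line < n_lines);
    if (line + 1 < n_lines)
        return line_starts[line + 1] - 1;
    return paragraph_length;
}
```

For "Hello big World" wrapped as "Hello big " / "World" (`line_starts = {0, 10}`, length 15), 'end' on the first line gives index 9, just before the trailing space, and 'end' on the second line gives 15. Because index 9 still belongs to the first line, the caret no longer gets dumped on the line after.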

Conclusions

We should:

mjakeman commented 2 years ago

Something else to consider is that we should differentiate between child indices (e.g. a text run is "the second child of" a paragraph) and character indices (e.g. the cursor is at index 7 in the string "Hello W|orld").

This will be important when dealing with non-textual elements like Images, and even more so when implementing native equations.
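One way to keep the two kinds of index from being mixed up is to give each its own wrapper type, so that passing one where the other is expected fails to compile. This is only a sketch: `ChildIndex`, `CharIndex`, `Item` and the two accessors are hypothetical names, and ASCII text is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Distinct wrapper types: a ChildIndex cannot be passed as a CharIndex. */
typedef struct { size_t value; } ChildIndex; /* position among a node's children */
typedef struct { size_t value; } CharIndex;  /* position within a string */

typedef enum { ITEM_RUN, ITEM_IMAGE } ItemKind;

/* Hypothetical paragraph child: a text run or a non-textual element. */
typedef struct {
    ItemKind kind;
    const char *text; /* NULL for non-textual items such as images */
} Item;

/* Look up a child by child index; returns NULL if out of range. */
static const Item *paragraph_child(const Item *children, size_t n_children, ChildIndex i)
{
    return i.value < n_children ? &children[i.value] : NULL;
}

/* Look up the character following a character index; returns -1 if none. */
static int text_char_at(const char *text, CharIndex i)
{
    return i.value < strlen(text) ? (unsigned char) text[i.value] : -1;
}
```

With this split, an image is addressable as a child even though it has no character indices of its own, and the cursor in "Hello W|orld" is `(CharIndex){ 7 }`, which sits before the 'o'.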