feat(windows): core needs to preprocess U+000D U+000A to U+000A in context before any other processing

mcdurdin commented 7 months ago

See the EDIT below. I anticipate that this will become a problem soon (encountered while working on the keyboard debugger).

I think this belongs under LDML keyboardprocessor. On Windows at least, \r\n should always be deleted as a block for K_BKSP.

We may need to special-case this, testing for the presence of this pair at the end of the context and requesting 2 back-deletions rather than 1.
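A minimal sketch of that special case, assuming the context is available as a plain string of code units. The function name `backspace_units` is hypothetical and not part of Keyman Core or Engine, and surrogate pairs are ignored here for brevity.

```python
def backspace_units(context: str) -> int:
    """Return how many back-deletions a single K_BKSP should request.

    Hypothetical helper for illustration only: a trailing CR LF pair is
    deleted as a block (2 deletions); anything else costs 1 deletion.
    """
    if context.endswith("\r\n"):
        return 2
    # An empty context has nothing to delete.
    return 1 if context else 0
```

A lone `\r` or a lone `\n` at the end of the context still counts as a single deletion in this sketch.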

(Note: I think this will become visible with the move from action queue to action struct in #10441, and become more obvious after #10415 is implemented.)

EDIT:

Principle -- Engine MUST preprocess context from compliant apps to convert \r\n to \n before supplying to Core, and then when emitting into compliant apps, do the inverse, \n to \r\n. Note the Keyman Developer debugger also needs to consider doing this.
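The round trip in this principle could look roughly like the following. This is a sketch only: `to_core` and `to_app` are hypothetical names, not Engine or Core functions, and it assumes the app uses `\r\n` for every line break.

```python
def to_core(app_context: str) -> str:
    """Engine -> Core: collapse every CR LF pair to a single LF."""
    return app_context.replace("\r\n", "\n")


def to_app(core_output: str) -> str:
    """Core -> Engine -> app: expand every LF back to CR LF.

    Any CR LF already present is collapsed first so that the
    expansion never produces CR CR LF.
    """
    return core_output.replace("\r\n", "\n").replace("\n", "\r\n")
```

With this in place, a keyboard would only ever see `A\nB` in context, while the compliant app would still receive `A\r\nB`.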

See PR #10697, which implements the proposed algorithm in the engine, but we want it in the core.

We track which line-ending pattern the engine gives the core in the input context, and give the same pattern back. This then covers all combinations of \r, \n, and \r\n.
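One possible reading of "track the pattern and give it back" is sketched here. All names are hypothetical, and the bookkeeping is an assumption rather than what PR #10697 does: each line break in the incoming context is recorded as either `\r\n` or `\n`, a lone `\r` is passed through untouched, and line breaks are restored by position on the way out.

```python
import re

# Matches the two forms that are normalized to LF; a lone CR is left alone.
_LINE_BREAK = re.compile(r"\r\n|\n")


def normalize_with_pattern(app_context: str) -> tuple[str, list[str]]:
    """Return the LF-only context plus the original form of each line break."""
    pattern = _LINE_BREAK.findall(app_context)
    return _LINE_BREAK.sub("\n", app_context), pattern


def restore_pattern(core_text: str, pattern: list[str], default: str = "\r\n") -> str:
    """Re-expand each LF using the recorded forms, in order.

    Line breaks beyond the recorded pattern (new ones produced by the
    keyboard) fall back to `default`, which is an assumed platform default.
    """
    forms = iter(pattern)
    return re.sub("\n", lambda _match: next(forms, default), core_text)
```

Positional matching like this would misalign if the keyboard deleted a line break in the middle of the context, so a real implementation would need something more robust; the sketch only shows the shape of the idea.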

Test cases will be needed around buffer limits (in case the algorithm causes truncation). The developer debugger should also be tested against the built and installed keyboard, to make sure the debugger exhibits the same behaviour. Testing is also needed on each platform (Linux, Windows, macOS).

mcdurdin commented 7 months ago

@rc-swag assigning to you, but happy to discuss who owns this.

srl295 commented 7 months ago

Q: how does kmx handle this?

Is this going to be an issue for authors? That is, will keyboards see A\nB on some platforms and A\r\nB on others? Dare I say, almost a normalization issue.

rc-swag commented 6 months ago

I don't follow: when will there be a need to backspace over a \r\n? Internally in the core, the context will be invalidated on a "carriage return". On the platform side, for context-aware/compliant apps, set_if_needed will only have the context up to the start of the line.

mcdurdin commented 6 months ago

Note: see the edit in the OP for a change in perspective following discussion with Ross. Keeping the same issue for now.

Principle -- preprocess context from compliant apps to convert \r\n to \n, and then when emitting into compliant apps, do the inverse, \n to \r\n.

mcdurdin commented 6 months ago

> Q: how does kmx handle this?
>
> Is this going to be an issue for authors? That is, will keyboards see A\nB on some platforms and A\r\nB on others? Dare I say, almost a normalization issue.

Yes, we had the same discussion. The resolution is for Engine to normalize the context before passing it into Core, given that this applies mainly to Windows. Keyboard authors will then only ever see \n.

mcdurdin commented 6 months ago

> Internally in the core, the context will be invalidated on a "carriage return". On the platform side, for context-aware/compliant apps, set_if_needed will only have the context up to the start of the line.

The context may include a \n -- some apps start a new context on a new para, others treat the entire text buffer as a single unit.

mcdurdin commented 6 months ago

Note: we could add a hint to the keyboard compiler in the future to note that 0x0D will never be seen in context (noting that more work needs to be done on KMW for this to be the case).

keymanapp / keyman

feat(windows): core needs to preprocess U+000D U+000A to U+000A in context before any other processing #10471