Closed vadi2 closed 3 years ago
Source code sample
https://github.com/microsoft/vscode/pull/136347/commits/d2c24cc1d1161193c46ac44364a053c082657d69
Finding these character isn't very hard. I to figure out how to renderer it. The render pipeline uses QTextLayout to render this stuff, I don't know if I can render several characters as one character.
Btw I tried the 'bad' code in Edbee.. The cursor behaves very strange when move past this code. (It move correctly from right to left)
Sample bad code:
// from, to, amount
transferBalance(5678,6776,4321,"USD");
Work in progress...
This look good, but isn't working. The caret positioning doesn't respect the [U+202e] texts as single character. I don't know if this can be resolved easily ..
Does it treat it as multiple characters?
Yes, that's the problem for now.. It is treated as multiple characters, but editing works with the original characters, so it behaves very strange.
An alternative for now, could be showing a single error character... (Unfortunately you don't see the unicode this way)
I think that is a fine alternative 👍
@vadi2 Can I enable this feature by default? (I think you don't want to disable this)
Note to self, for VSCODE style rendering:
At the moment the x-position of the caret is calculated with a line from QTextLayout
and QTextLine::cursorToX(column)
.
It is only used in source files: It TextRenderer.cpp
and TextEditorRenderer.cpp
Because the QTextLine contains the full text [U+XXXX] it will calculate the wrong x-position. Perhaps it is possible to override the QTextLine to offset column location supplied when calculating the x.
Other note to self:
Research QTextLayout.setPreeditArea? Is it usable to add pieces text that the cursor skips?
I agree, it should be enabled by default.
Btw the current solution isn't, perfect, it doesn't correctly update the other format-ranges. But I guess is sufficient for now, to improve security related issues
What do you mean by the other format ranges? Things like homoglyphs?
The syntax highlighting is shifted for the line with bidi-characters , because the control character is becoming visibile. (When you look at the example above. The 6 becomes white instead of the comma). This is probably is fixable, by adjusting the other FormatRanges supplied to the QTextLayout.
It's even worse, it also shifts the movement of the cursor!
Yeah, I see. Well we'll expose an option to turn it off should anyone actually be working with bidirectional text - which isn't a usecase we have in Mudlet.
Could you add it a as a configurable document option, so we can allow users to change this at runtime?
I just implement it with a TextEditorConfig::renderBidiContolCharacters (which defaults to true) Further adopted another character as warning symbol: ␥ (http://unicode.org/charts/PDF/U2400.pdf) This character doesn't influence the caret-movement en syntax coloring. (So it's much better)
Further this is how it looks with renderBidiContolCharacters = false
And this with renderBidiContolCharacters = true
Also the control characters are disabled, because they are replaced. So the order of the source-code remains.
Great, thanks!
See https://www.trojansource.codes, it is possible to sneak in Unicode control characters which re-order how source code is understood - and if the editor doesn't show them (none did by default), bad things can happen.
How VScode solved it: https://code.visualstudio.com/updates/v1_62#_unicode-directional-formatting-characters