When the source text contains a Unicode char with a high ord() (e.g. many emojis), editor.get_selection() counts it as 2 instead of 1 (as len() correctly does). Any text selected after the high char is therefore off by 1.
E.g. a snippet that prints the current selection prints 'editor' when the first occurrence of editor (on the import line) is selected in the editor, but 'ditor.' for the other two (on the print line).
I ran into this while writing a little Unicode utility for the wrench menu.
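A minimal sketch of the off-by-1 effect and one way to compensate for it. It assumes (this is a guess, not confirmed anywhere above) that the offsets returned by get_selection() are UTF-16 code-unit offsets, where chars above 0xFFFF take 2 units. The helper names utf16_to_index and selected_text are hypothetical, and this is not the Gist work-around.

```python
def utf16_to_index(text, offset):
    """Map a UTF-16 code-unit offset to a Python str index (code points).

    An offset that falls in the middle of a 2-unit char is rounded up
    to the index just after that char.
    """
    units = 0
    for index, ch in enumerate(text):
        if units >= offset:
            return index
        # Chars above U+FFFF occupy two UTF-16 code units.
        units += 2 if ord(ch) > 0xFFFF else 1
    return len(text)


def selected_text(text, start, end):
    """Slice text using offsets assumed to be in UTF-16 code units."""
    return text[utf16_to_index(text, start):utf16_to_index(text, end)]


# Stand-in for the editor contents: one high char before the word "editor".
sample = "a\U0001F600 editor."
start, end = 4, 10  # assumed UTF-16 offsets of the word "editor"

naive = sample[start:end]                 # offsets used directly as str indices
fixed = selected_text(sample, start, end)
```

Here naive comes out as 'ditor.' and fixed as 'editor', which matches the behaviour described above. In Pythonista the offsets would come from editor.get_selection() and the text from editor.get_text().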
The issue surfaces differently with editor.get_line_selection(): if the selection runs to the end of a file that contains a high char, it raises an IndexError.
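One possible explanation for the IndexError, sketched in plain Python: if the end offset is counted in UTF-16 code units (an assumption, as the internals of get_line_selection() are not visible), it can point past the end of the Python string. The helper utf16_length is hypothetical.

```python
def utf16_length(text):
    """Number of UTF-16 code units needed to store text."""
    return len(text.encode("utf-16-le")) // 2


# Stand-in for a file that contains one high char.
source = "print('\U0001F600')"

end = utf16_length(source)   # end of file, counted in UTF-16 code units
extra = end - len(source)    # one extra unit per high char

try:
    source[end - 1]          # last "char" according to the UTF-16 count
    overshoots = False
except IndexError:
    overshoots = True
```

With one high char in the text, end is 1 larger than len(source), so indexing with it overshoots.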
By the way, this also seems to impact Pythonista's editor internals.
If you:
position the cursor just before the high char,
select 1 char to the right (Shift + Right Arrow on an external keyboard),
and then delete it (Backspace on the external keyboard),
half of the selected char is deleted (violating the atomicity of Unicode chars).
The editor then shows a strange symbol in place of the char and the text is left in an illegal state, e.g. the print(...gettext()...) call raises an exception when decoding it.
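The illegal state can be reproduced outside the editor. The sketch below assumes the text is stored as UTF-16 and that the delete removed only the first of the two code units of the high char; both are assumptions based on the symptoms, not on knowledge of the editor internals.

```python
text = "a\U0001F600b"
encoded = text.encode("utf-16-le")   # a, high surrogate, low surrogate, b

# Remove code unit 1 (bytes 2..3), i.e. only the first half of the emoji.
damaged = encoded[:2] + encoded[4:]

try:
    damaged.decode("utf-16-le")
    decode_failed = False
except UnicodeDecodeError:
    # A lone surrogate is not valid UTF-16, so strict decoding fails.
    decode_failed = True
```

The remaining half is a lone surrogate, which a strict decoder rejects. That would explain both the strange symbol and the exception.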
I made a work-around at this Gist snippet.
iPad, Python 3.6, latest beta, latest iOS.