Utf8 text breaks syntax highlighting

davidm / lua-inspect

Lua code analysis, with plugins for HTML and SciTE

http://lua-users.org/wiki/LuaInspect

Utf8 text breaks syntax highlighting #1

Closed mkottman closed 14 years ago

mkottman commented 14 years ago

When lua-inspect runs in SciTE and encounters Unicode text encoded in UTF-8, the syntax highlighting that follows the text is offset from where it is supposed to be (screenshot demonstrating the problem).

Also, the output pane of SciTE shows the following error:

lua-inspect/luainspectlib/luainspect/scite.lua:885: bad argument #1 to 'char' (invalid value)

xolox commented 14 years ago

If David decides to add Unicode support to LuaInspect, then I think UTF-8 would be the best choice. On the other hand, this might complicate the code quite a lot, because it uses byte offsets everywhere and (apparently) the highlighting performed in scite.lua assumes that one byte = one character.
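
To make the byte-versus-character mismatch concrete, here is a minimal sketch. It is written in Python rather than Lua purely for illustration, with a Python `bytes` object standing in for a Lua string; the sample source line is hypothetical and not taken from the report.

```python
# Illustration only: a Lua string is a sequence of bytes, modeled here as Python bytes.
# The sample line is a made-up example, not the text from the original report.
source = 'local s = "žluťoučký" print(s)'
encoded = source.encode("utf-8")

# Position of the same token, counted two different ways.
char_offset = source.index("print")    # counted in characters
byte_offset = encoded.index(b"print")  # counted in bytes

# Every multi-byte character before the token pushes the two counts further apart.
drift = byte_offset - char_offset
print(char_offset, byte_offset, drift)
```

If one component reports positions in bytes and another applies them as character positions, everything after the first non-ASCII character is styled `drift` positions away from where it should be, which matches the symptom described above.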

davidm commented 14 years ago

I think this is fixed with only a few lines of code:

http://github.com/davidm/lua-inspect/commit/78daa89b23e827c09db0ef0dfb8f301b82d2e22e
http://github.com/davidm/lua-inspect/commit/e8aa65d1a535b53a4d3821fdbed0c256d3d90a59

SciTE, like LuaInspect, generally treats a document as bytes rather than characters. The SciTE lexer, however, does deal in terms of characters (http://www.scintilla.org/ScriptLexer.html), but the byte length of each character can be measured to avoid the problem above.
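
One way to measure character byte lengths is sketched below. This is not the code from the commits linked above, whose contents are not shown in this thread; it is a hedged illustration in Python, and the helper names `utf8_char_length` and `char_to_byte_offset` are hypothetical. It assumes the input is valid UTF-8.

```python
def utf8_char_length(lead_byte: int) -> int:
    """Return the byte length of a UTF-8 sequence, given its first byte."""
    if lead_byte < 0x80:            # 0xxxxxxx: ASCII, one byte
        return 1
    if lead_byte >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if lead_byte >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if lead_byte >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"not a UTF-8 lead byte: {lead_byte:#x}")


def char_to_byte_offset(data: bytes, char_index: int) -> int:
    """Convert a character index into a byte offset by walking the lead bytes."""
    offset = 0
    for _ in range(char_index):
        offset += utf8_char_length(data[offset])
    return offset


# Hypothetical sample line, assumed to be valid UTF-8.
data = 'local s = "žluťoučký" print(s)'.encode("utf-8")
start = char_to_byte_offset(data, 22)  # character 22 is the "p" of "print"
print(start, data[start:start + 5])
```

Walking from the start of the buffer is linear in the character index; a real lexer callback would more likely accumulate the byte length as it advances instead of recomputing it for each token.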

davidm commented 14 years ago