LuaLS / lua-language-server

A language server that offers Lua language support - programmed in Lua
https://luals.github.io
MIT License

Feature Request: Support for parsing UTF-16 LE Lua files #508

Closed. EvanCox14 closed this issue 2 years ago.

EvanCox14 commented 3 years ago

Describe the bug

When parsing files in UTF-16, the syntax checker reports hundreds of "unexpected symbol" and "unexpected <exp>" errors, even for a simple 20-line file. Once such a file is opened directly, its errors disappear, but as soon as the checker reruns on the entire workspace they reappear for every file that is not open. In a workspace with over 1000 Lua files in UTF-16 (many of which are translation files), I get more than 500K errors, which makes it unmanageable to sift through the problems and find real ones.

To Reproduce

  1. Create a new workspace in VSCode with the Lua extension installed.
  2. Create two "hello world!" style files. Save one as UTF-8 and the other as UTF-16.
  3. Explicitly close the UTF-16 file so that it doesn't reopen afterwards, then close and reopen the workspace.
  4. The Problems tab will show many errors, as seen in the screenshot.
  5. After opening the UTF-16 file directly, the problems disappear.

Expected behavior

If the problems for a file disappear once it is opened directly, they should not appear for that file while it is closed.

Screenshots

After closing and reopening the workspace: image

After opening the utf16 file: image

Environment

Additional context

I cannot change the encoding of the files in my project because the Lua files are part of the game files, which I cannot and should not change on my machine. I am trying to use this extension to help me write a mod for the game "Forts." Forts uses a variant of Lua called "LuaPlus", which adds a custom value type called "Wide Character Strings." http://wwhiz.com/LuaPlus/LuaPlus.html#WideCharacterStrings

Provide logs

Log of the reproduction test case on my machine: lua_test.log

sumneko commented 3 years ago

Currently I have no way to support UTF-16. (When you open the file in VSCode, VSCode converts it to UTF-8 before sending it to me.) As a workaround, you can set Lua.workspace.maxPreload to 0 to prevent Lua files in the workspace from being preloaded.
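To illustrate why a closed UTF-16 file would produce so many errors, here is a minimal Python sketch (not the server's Lua code) comparing the bytes of the same source in both encodings. The sample source string and the helper name `looks_like_utf16le` are hypothetical, and the sketch assumes that files read from disk during preloading are parsed as raw bytes without conversion.

```python
# Hypothetical one-line Lua source, used only for illustration.
source = 'print("hello world!")\n'

# The same text as it would be stored on disk in each encoding.
utf8_bytes = source.encode("utf-8")
# An explicit BOM plus little-endian code units mirrors a "UTF-16 LE" file.
utf16_bytes = b"\xff\xfe" + source.encode("utf-16-le")


def looks_like_utf16le(raw: bytes) -> bool:
    """Report whether the data starts with the UTF-16 LE byte order mark."""
    return raw.startswith(b"\xff\xfe")


# Every ASCII character becomes two bytes in UTF-16 LE, the second being NUL.
# A parser that expects UTF-8 sees those NUL bytes as stray symbols.
nul_count = utf16_bytes.count(b"\x00")

print(len(utf8_bytes), len(utf16_bytes), nul_count)
print(looks_like_utf16le(utf8_bytes), looks_like_utf16le(utf16_bytes))
```

Under that assumption, every character of an ASCII-only file is followed by a byte the parser does not expect, which would be consistent with the error counts reported above.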

EvanCox14 commented 3 years ago

Is there anything that I could do to investigate the issue and work towards a resolution on my own? If it were possible, where in the code would I start looking to make changes?

I appreciate the tip about setting maxPreload to 0 to avoid the errors. I may try that. But I think I would lose many of the features I chose this extension for, since there are many files that set many global values.

sumneko commented 3 years ago

https://github.com/sumneko/lua-language-server/blob/1c1b62b4343ecb2f360a9ce81524962bf2e80f55/script/files.lua#L167-L169

You can start from the code linked above. The main problem is that I don't have a UTF-16 encoding conversion library that can be used from Lua.

EvanCox14 commented 3 years ago

Thanks! I'll look into this tomorrow.

EvanCox14 commented 3 years ago

I see now how difficult this task is. Dealing with file encodings is a lot of trouble. I see C++ functions in the bee submodule for converting between ANSI, wide strings, and Unicode. From my research, I think wide strings may be UTF-16 in some cases. If it is possible, I think the simplest approach might be to bind those functions into Lua and call them for UTF-16 conversion, though I doubt the work would be as simple as I describe it.
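The conversion step under discussion can be sketched independently of the bee submodule. The following Python example is only an illustration of BOM-based detection followed by decoding. The function name `decode_lua_source` is hypothetical, it is not how the language server or bee implements anything, and it assumes UTF-16 files always start with a byte order mark.

```python
import codecs


def decode_lua_source(raw: bytes) -> str:
    """Decode file bytes to text, using a leading BOM to pick the encoding.

    Assumption: UTF-16 files carry a BOM. Files without one are treated
    as UTF-8, so BOM-less UTF-16 is not handled by this sketch.
    """
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode("utf-8")


# Usage: the same source decodes to identical text from either encoding.
utf16_file = codecs.BOM_UTF16_LE + 'print("hello")'.encode("utf-16-le")
utf8_file = 'print("hello")'.encode("utf-8")
print(decode_lua_source(utf16_file) == decode_lua_source(utf8_file))
```

One limitation of this sketch is that the UTF-32 LE byte order mark begins with the same two bytes as the UTF-16 LE one, so UTF-32 input would be misdetected. A real implementation would also need a policy for files with no BOM at all.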

sumneko commented 3 years ago

You may open a PR