orbitalquark / textadept

Textadept is a fast, minimalist, and remarkably extensible cross-platform text editor for programmers.
https://orbitalquark.github.io/textadept
MIT License

Error when saving in a Windows path localised region #474

Open ie-rosie opened 10 months ago

ie-rosie commented 10 months ago

Hello,

I stumbled across this error recently, searched for a solution, and think I may have found one, even though I don't know exactly what I did. When I try to save a text file on Windows in a path that contains Romanian characters (ă, î, â, ș, ț), such as C:\Users\user\ASE\Controlul statistic al calității, the app gives an invalid encoding error message.
I tried to edit the config files, but I quit shortly after because I don't know Lua programming. After that, I found this issue on GitHub, where the error in question was detailed. I followed the advice there (https://github.com/orbitalquark/textadept/issues/367#issuecomment-1478240677) and tried changing the format and region of the OS to Romania and even enabling UTF-8, but it didn't help.

[screenshot]

After that, I searched the manual and found this disclaimer in the Windows Note section: "The editor can only open files whose names contain characters in the system's encoding (e.g. CP1252 for English and most European languages)."

So I changed line 71, in the Unserialize buffers function of the session.lua config file, to:

not_found[#not_found + 1] = buf.filename:iconv('ASCII', _CHARSET)

After this tweak I can edit and save files in paths that contain special characters. The only downside is that Textadept displays an "invalid encoding(s)" warning, which doesn't really affect the workflow in the app.

[screenshot]

Maybe the message would disappear after modifying the Serialize buffers function in the session.lua config file, but I won't change the settings any further, as I don't have the expertise needed for this.

PS: This is the first issue I have opened, so the text might not display as intended. Edit: Actually, the text displays as intended. :)

orbitalquark commented 10 months ago

Sorry that you're experiencing this issue :( I think your workaround works because your Romanian characters are in the extended ASCII encoding, which Lua can handle when it comes to I/O (as mentioned in your referenced issue). However, when it comes to display (UTF-8), iconv/Textadept does not know how to make the conversion, hence your screenshot.

If you revert your workaround, do things change if you disable that beta UTF-8 option in the Windows Region Settings? Also, please open the command entry (Tools > Command Entry) and type _CHARSET both with the UTF-8 option enabled and then disabled (please restart Textadept after making changes). I'd like to know what the results are. Perhaps we can identify a better workaround.

Thanks for your patience despite not knowing Lua!

georgeraraujo commented 10 months ago

Hi @orbitalquark, would lua-unicode be a possible avenue to overcome this limitation? As per the Wireshark documentation:

> Wireshark for Windows uses a modified Lua runtime (lua-unicode) to support Unicode (UTF-8) filesystem paths. This brings consistency with other platforms (for example, Linux and macOS).

orbitalquark commented 10 months ago

Thanks for the link. It may be possible to use that or something similar.

ie-rosie commented 10 months ago

Hello! Sorry for the late response!

With the UTF-8 locale option enabled, _CHARSET gives CP65001; without UTF-8 enabled, it gives CP1250 (with either ASCII or UTF-8 in line 71 of the Unserialize buffers function of the session.lua module).

I forgot to mention that the version I am using is 12.1.0.0 Stable installed using the Scoop manager.

I also ran the _CHARSET command in the Nightly version of the app from 18th October. Without UTF-8 enabled, I got CP1250 (with either ASCII or UTF-8 in line 71 of the Unserialize buffers function of the session.lua module). With UTF-8 enabled, I got an Initialization Error when I first opened the app. There are invalid encodings at line 71, where I have the function convert to ASCII. Running the _CHARSET command then gave CP65001. With line 71 changed to UTF-8, _CHARSET also gave CP65001.
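
For readers following along: CP65001 is the Windows code page identifier for UTF-8, and CP1250 is the Central European ANSI code page. The minimal Python sketch below is not Textadept code, and unrepresentable is a hypothetical helper. It checks which of the Romanian letters from the report each encoding can represent, assuming that ș and ț are the comma-below forms (U+0219, U+021B); the thread does not confirm which code points are involved.

```python
# Assumption: the letters are a-breve, i-circumflex, a-circumflex, and the
# comma-below forms of s and t (U+0219, U+021B), written as escapes so the
# code points are unambiguous.
ROMANIAN_LETTERS = "\u0103\u00ee\u00e2\u0219\u021b"


def unrepresentable(text, codec):
    """Return the characters of `text` that `codec` cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # "utf-8" stands in for CP65001, the Windows name for the same encoding.
    for codec in ("cp1250", "utf-8"):
        print(codec, unrepresentable(ROMANIAN_LETTERS, codec))
```

Python's cp1250 codec has no mapping for U+0219 or U+021B (it only has the cedilla variants), so under the comma-below assumption those two letters cannot be expressed in CP1250 at all, while UTF-8 covers all five. Whether this is what Textadept runs into is a guess, not a diagnosis.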

georgeraraujo commented 10 months ago

> Thanks for the link. It may be possible to use that or something similar.

Got it to work (I guess).

Here is こんにちは世界.lua:

print("Hello World")

With lua-5.4.2_Win64_bin.zip from LuaBinaries:

C:\TEMP\lua-5.4.2_Win64_bin>lua54 こんにちは世界.lua
lua54: cannot open ???????.lua: Invalid argument

With lua-unicode:

C:\TEMP\lua-unicode-master-5.4\lua-5.4.6\build64\lua-5.4.6-unicode-win64-vc14>lua54 こんにちは世界.lua
Hello World

orbitalquark commented 10 months ago

> Hello! Sorry for the late response!

No problem.

> With the UTF-8 locale option enabled, _CHARSET gives CP65001; without UTF-8 enabled, it gives CP1250 (with either ASCII or UTF-8 in line 71 of the Unserialize buffers function of the session.lua module).

When you disable the UTF-8 locale option, and without your Unserialize buffers change, do you still get an invalid encoding error message when you try to save a text file with Romanian characters in its filename?

ie-rosie commented 10 months ago

Yes, I do. Below are the messages. In the first one, ț.txt is changed to ?.txt. The same thing happens when saving a file in a path with Romanian characters. Line 194 is in the function commented "LuaDoc is in core/.buffer.luadoc" in the file_io.lua module.

_C:\Users\user\scoop\apps\textadept\current/core/file_io.lua:194: C:\Users\user\Desktop\?.txt: Invalid argument_

_C:\Users\user\scoop\apps\textadept\current/core/file_io.lua:194: C:\Users\user\MEGA\Controlul statistic al calit�?ii\Proiect\test.txt: Invalid argument_
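
One plausible reading of these messages, sketched in Python rather than taken from Textadept's source: the path is converted to the ANSI code page (CP1250, per the _CHARSET result), characters without a mapping become ?, and the remaining non-ASCII bytes are later displayed as if they were UTF-8. lossy_roundtrip is a hypothetical helper, and treating ț as the comma-below code point U+021B is an assumption.

```python
def lossy_roundtrip(path, ansi_codec="cp1250"):
    """Mimic a lossy ANSI conversion followed by display as UTF-8.

    Step 1: encode to the ANSI code page; unmappable characters become b"?".
    Step 2: decode the resulting bytes as UTF-8; invalid bytes become U+FFFD.
    This is an illustrative assumption, not Textadept's actual code path.
    """
    raw = path.encode(ansi_codec, errors="replace")
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "calității" spelled with escapes: a-breve (U+0103), t-comma (U+021B).
    print(lossy_roundtrip("calit\u0103\u021bii"))
    print(lossy_roundtrip("\u021b.txt"))
```

Under those assumptions the two calls yield "calit\ufffd?ii" and "?.txt", which has the same shape as the paths in the error messages: ă survives as a single CP1250 byte that is not valid UTF-8, while ț has no CP1250 byte and is replaced outright.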

orbitalquark commented 10 months ago

Okay, thanks for the extra information. I'll put it on my TODO list, but it's low priority since I've looked into this before with Greek and came up empty (the issue you originally linked to). I'm glad you have a workaround in the meantime, despite how ugly it might be. I do appreciate your report though!