XhmikosR / notepad2-mod

LOOKING FOR DEVELOPERS - Notepad2-mod, a Notepad2 fork, a fast and light-weight Notepad-like text editor with syntax highlighting
https://xhmikosr.github.io/notepad2-mod/

Input in ANSI Encoding #173

Open alekhe opened 8 years ago

alekhe commented 8 years ago

Notepad2-mod.4.2.25.985_x64.

When ANSI encoding is active, typed characters are not recognized: the text is shown as question marks. When OEM or UTF-8 encoding is active, everything is OK.

My ANSI encoding is Windows-1251.

This does not happen in Notepad2-mod.4.2.25.980.

XhmikosR commented 8 years ago

I've noticed that too, but it happens only on Windows 10. No idea why.

PinoTucana commented 8 years ago

Press F8, then choose an encoding that displays your text correctly and click OK. Then go to File, Encoding, change to Unicode, and press Ctrl+S.

XhmikosR commented 8 years ago

That is another thing. Before Windows 10, this worked for ANSI.

jberezanski commented 8 years ago

I've got a somewhat similar problem, too. With the OS configured for Polish everything (keyboard layout, regional settings, language for non-Unicode programs etc.), on Notepad2-mod 4.2.25.985 with ANSI encoding active, typing Polish accented letters (AltGr+L,A,S,C...) results in plain Latin characters being input ("lasc" instead of "łąść"); only one Polish letter works correctly: "ó" (AltGr+O). This happens on Windows 8.1, Windows Server 2012 R2 and Windows 10 1607. The problem did not appear with the previous Notepad2-mod version (4.2.25.980) on any of those OSes.

XhmikosR commented 8 years ago

Then I guess it's https://github.com/XhmikosR/notepad2-mod/commit/7529a6b906f1739188dd31c60af8fd742a7acd20?w=1 and specifically https://github.com/XhmikosR/notepad2-mod/commit/7529a6b906f1739188dd31c60af8fd742a7acd20?w=1#diff-3116791c8fff3a31032997bb45378493R1148

Not sure how to proceed. I mean, I think the change is right.

jberezanski commented 8 years ago

The change may be right from the Scintilla component's point of view, but not necessarily from the point of view of the entire application.

For a Windows editor, I would expect the term "ANSI encoding" to mean "encoding in the current system default ANSI code page, as returned by the GetACP() function". All editors I've been using worked that way. I would also expect the editor to use this code page by default when opening text files that do not have an unambiguous encoding marker (the UTF-8 or UTF-16 BOM). At present, notepad2-mod .985 exhibits mixed behavior: it reads such files using the system code page (1250 in my case) and the content is displayed correctly, but then I am unable to enter new accented letters, as I described in the previous post.

Perhaps Scintilla can be configured to use a specific default code page (the output of GetACP()) at run time? (I know nothing about its internals or its API.)
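The file-opening behavior described above (honor a UTF-8 or UTF-16 BOM, otherwise fall back to the system ANSI code page) could be sketched roughly like this. This is only an illustration in Python, not Notepad2-mod or Scintilla code: `decode_text` is a hypothetical helper, and the `ansi_codec` parameter stands in for whatever `GetACP()` would report (1250 is assumed here).

```python
import codecs

# Hypothetical helper, not Notepad2-mod code: choose a decoder for raw file bytes.
# ansi_codec stands in for the system ANSI code page (the GetACP() result).
def decode_text(data, ansi_codec="cp1250"):
    """Return (text, encoding_name) for the given raw bytes."""
    if data.startswith(codecs.BOM_UTF8):
        # Unambiguous UTF-8 marker: strip the BOM and decode the rest.
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM itself to pick the byte order.
        return data.decode("utf-16"), "utf-16"
    # No marker: assume the system default ANSI code page.
    return data.decode(ansi_codec), ansi_codec


if __name__ == "__main__":
    print(decode_text("łąść".encode("cp1250")))
    print(decode_text(codecs.BOM_UTF8 + "łąść".encode("utf-8")))
```

The same code page would then have to be used for keyboard input as well, which is the part that no longer matches in .985.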

XhmikosR commented 8 years ago

You are welcome to submit a PR in order to have the same behavior as before. Otherwise maybe it's time to revisit setting UTF-8 as default.

jberezanski commented 7 years ago

Thinking about it some more, I no longer believe the change in Scintilla was right at all. Their release notes mention crashes on DBCS systems, but crippling ANSI text editing on all non-US systems is not what I would call a good workaround. I looked at the Scintilla API and did not see a way to set an explicit non-DBCS code page (SCI_SETCODEPAGE is intended only for setting DBCS code pages and setting it turns off many Scintilla features).

And UTF-8 might be a good default, but it does not replace the ability to edit text files in the native system code page.

I'll work on a PR once I finish rebuilding my dev machine (i.e. in a week or so).

jberezanski commented 7 years ago

That took a bit longer than a week, but the PR is ready now.

XhmikosR commented 7 years ago

I've already switched to UTF-8 default encoding. And patching Scintilla isn't something I like.

jberezanski commented 7 years ago

I thought you'd be okay with that, as you wrote before. Changing the default only sort of avoids the issue for new files, but it does not bring back the ability to edit files saved in the system default non-Unicode code page. Right now, even the UI is misleading: on a system with the default code page set to, for example, 1250 (Eastern European languages), the encoding selection dialog in Notepad2-mod shows "ANSI (Windows-1250)". Yet, if this option is chosen, Scintilla uses the hardcoded 1252 code page. It is no longer possible to configure Notepad2-mod to automatically use the system default code page.
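To illustrate why a hardcoded 1252 code page would produce the symptoms reported in this thread, here is a small Python model. It assumes that typed characters are converted to bytes with code page 1252 regardless of the system setting; it does not call Scintilla or any Windows API, and `typed_result` and `survivors` are hypothetical helper names. Python substitutes "?" for unrepresentable characters, whereas a Windows best-fit conversion may substitute plain Latin letters instead.

```python
# Model only: assumes keyboard input is forced through code page 1252.
HARDCODED = "cp1252"

def typed_result(text, codec=HARDCODED):
    """Round-trip text through codec; unrepresentable characters become '?'."""
    return text.encode(codec, errors="replace").decode(codec)

def survivors(letters, codec=HARDCODED):
    """Return only the letters that codec can represent."""
    kept = []
    for ch in letters:
        try:
            ch.encode(codec)
            kept.append(ch)
        except UnicodeEncodeError:
            pass  # this letter has no byte in the target code page
    return "".join(kept)

if __name__ == "__main__":
    print(survivors("ąćęłńóśźż"))          # Polish letters that exist in 1252
    print(typed_result("привет"))          # Cyrillic (1251) text forced through 1252
    print(typed_result("łąść", "cp1250"))  # same kind of input with the matching code page
```

Under this model only "ó" survives among the Polish letters, since it sits at the same byte value (0xF3) in both 1250 and 1252, and Cyrillic input collapses entirely to question marks.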