Many non-Unicode text files on non-English systems are encoded in the
system-default code page. Users expect to be able to edit such files.
However, in version 3.6.7, the Scintilla developers decided to break
this scenario by equating the default (i.e. unspecified) code page with
code page 1252 (Western European). This causes Scintilla to mishandle
international characters typed by the user: they show up either as
unaccented Latin letters or as question marks. The only way to avoid
this behavior in Notepad2-mod is to set the file encoding manually.
Internally, Notepad2-mod attempts to do the right thing. The encoding
described in the UI as "ANSI" is internally mapped to CPI_DEFAULT, and
Notepad2-mod treats it as the system default code page, as evidenced by
the code that appends the output of the GetACP() Win32 function to the
description of this encoding
(Edit.c, function Encoding_InitDefaults()).
So, for example, on a Polish system the ANSI encoding option (in the
encoding selection dialogs) is shown as "ANSI (1250)". Due to the change
in Scintilla, however, this is no longer accurate: Scintilla will not
use code page 1250 (the default code page on that system), but the
hard-coded 1252 instead.
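
As a rough illustration of why the code page matters, the sketch below
uses Python's built-in cp1250 and cp1252 codecs as stand-ins for the
Windows conversions. It is not Notepad2-mod or Scintilla code, and the
sample word is an arbitrary choice made for this example.

```python
# Illustration only: Python codecs stand in for the Windows code page
# conversions; none of this is taken from Notepad2-mod or Scintilla.
text = "żółć"  # arbitrary Polish sample word (assumption of this sketch)

# Bytes as they would be stored by a code page 1250 (Polish) system.
stored = text.encode("cp1250")
assert stored == b"\xbf\xf3\xb3\xe6"

# The same bytes interpreted as code page 1252 yield different characters.
misread = stored.decode("cp1252")
assert misread == "\u00bf\u00f3\u00b3\u00e6"  # "¿ó³æ"

# Text forced into code page 1252: letters it lacks are replaced by "?".
lossy = text.encode("cp1252", errors="replace")
assert lossy == b"?\xf3??"
```

Only "ó" survives, because it exists in both code pages at the same
byte value. Python substitutes "?" for the rest; the Windows conversion
routines may instead pick a similar unaccented letter, which would
account for both symptoms described above.
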
In the Scintilla change history, the change in 3.6.7 is described as
"[preventing] unexpected behavior and crashes on East Asian systems".
It is the opinion of this developer that using the system default code
page by default is, in fact, the expected behavior from the user's point
of view (and Notepad2-mod is perfectly capable of handling multi-byte
encodings correctly), so the reasoning for the change is invalid and the
change should be reverted. This commit reverts it.
(For comparison, the other popular Scintilla-based editor, Notepad++,
currently uses an older Scintilla version (3.5.6), so it has not
encountered this issue yet.)
Fixes #173.