magiblot / tvision

A modern port of Turbo Vision 2.0, the classical framework for text-based user interfaces. Now cross-platform and with Unicode support.

quick corrections for UNICODE build: use the ANSI Win32 APIs #97

Closed GerHobbelt closed 1 year ago

GerHobbelt commented 1 year ago

Quick corrections for the UNICODE build: use the ANSI Win32 APIs where applicable. Before this change, the UNICODE build printed some odd garbage on the console; after this change, at least the hello app works as expected again.

(Background: UNICODE build was done using my own MSVC2022 project setup rig, which compiles everything in UNICODE (wide char) mode for Windows.)

TODO for later? Make tvision do the right thing and NOT use the *A Win32 APIs? 🤔 Would that be useful at all? 🤔

magiblot commented 1 year ago

Hi @GerHobbelt! I appreciate the effort.

Turbo Vision works perfectly fine if you build it with CMake, since the right compilation flags are then used.

I do not know exactly how you are building Turbo Vision, but if you are somehow setting the compilation flags manually, then you will be susceptible to running into issues like this one.

If I am not mistaken, there is nothing special about Unicode builds. All that happens when building in Unicode mode is that windows.h defines some macros differently. If your code uses TCHARs, these turn into wchar_ts; if your code calls the encoding-neutral name of a Win32 API function (as Turbo Vision does), these calls resolve to the wide char version of the function. But that's all. Nothing prevents you, for example, from building the Turbo Vision library in non-Unicode mode, building your own application in Unicode mode, and linking the two together, since the Turbo Vision public API is the same in both cases.
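
To make the macro mechanism concrete, here is a minimal, portable sketch that imitates what windows.h does. It does not include windows.h; `describeA`, `describeW`, `describe`, `TCHAR_` and `TEXT_` are hypothetical stand-ins for a real A/W function pair, its encoding-neutral alias, `TCHAR` and `TEXT()`.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a narrow/wide pair such as WriteConsoleA / WriteConsoleW.
inline std::string describeA(const char *) { return "narrow (A) version"; }
inline std::string describeW(const wchar_t *) { return "wide (W) version"; }

// In miniature, this is the switch windows.h performs: the unsuffixed
// name is only a macro whose target depends on whether UNICODE is defined.
#ifdef UNICODE
typedef wchar_t TCHAR_;      // stand-in for TCHAR
#define TEXT_(s) L##s        // stand-in for TEXT()
#define describe describeW
#else
typedef char TCHAR_;
#define TEXT_(s) s
#define describe describeA
#endif

// Code written against the neutral names changes meaning with the build mode.
inline std::string neutralCall()
{
    const TCHAR_ *msg = TEXT_("hello");
    return describe(msg);
}

// An explicit call resolves to the same function in both modes.
inline std::string explicitCall()
{
    return describeA("hello");
}
```

Compiling this with and without `-DUNICODE` changes what `neutralCall()` returns, while `explicitCall()` stays the same, which is why only code using the neutral names is sensitive to the build mode.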

Unicode mode is not enabled when building Turbo Vision with CMake. Turbo Vision intentionally uses the ANSI version of some Win32 API functions. Even though it had never occurred to me that it would be desirable to be able to build Turbo Vision with the Unicode mode enabled, I admit that it would be the right thing to make all these invocations explicit so that the build is unaffected by whether Unicode mode is enabled or not.

TODO for later? Make tvision do the right thing and NOT use the *A Win32 APIs? 🤔 Would that be useful at all? 🤔

I believe that remark implies the assumption that Windows applications are supposed to use the wide char version of Win32 API functions, which is not true. The concept of Unicode and non-Unicode builds made sense at the time when it was invented, since UTF-16 was the only way to get Unicode working. However, that is no longer the case, and not building in Unicode mode or not calling the wide char version of Win32 API functions does not mean that your program isn't using Unicode.

In the case of the console functions (WriteConsoleA, FillConsoleOutputCharacterA), for example, it is possible to use UTF-8, which in the case of Turbo Vision is the most convenient thing to do.

So, in short: the solution is not to always call the wide char version of Win32 API functions, but to make calls to the ANSI version of functions explicit, which I will look into.
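
As an illustration of an explicit ANSI call fed with UTF-8, here is a minimal sketch. `writeUtf8` is a hypothetical helper, not Turbo Vision's actual code; the non-Windows branch exists only so the example builds elsewhere and assumes a UTF-8 terminal.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes UTF-8 encoded bytes to the console and
// returns the count reported by the underlying call.
inline std::size_t writeUtf8(const char *utf8)
{
    const std::size_t len = std::strlen(utf8);
#ifdef _WIN32
    // Have the console interpret narrow output as UTF-8.
    SetConsoleOutputCP(CP_UTF8);
    DWORD written = 0;
    // The explicit 'A' suffix keeps this call narrow even when UNICODE is
    // defined. The call fails (leaving 'written' at 0) if output is redirected.
    WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE), utf8, (DWORD) len, &written, nullptr);
    return written;
#else
    // Assumption: outside Windows the terminal accepts UTF-8 bytes directly.
    return std::fwrite(utf8, 1, len, stdout);
#endif
}
```

Because the call names `WriteConsoleA` rather than `WriteConsole`, it compiles to the same thing whether or not the project defines UNICODE.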

Cheers.

GerHobbelt commented 1 year ago

Thank you!

I do not know exactly how you are building Turbo Vision, but if you are somehow setting the compilation flags manually, then you will be susceptible to running into issues like this one.

Indeed it's in-house generated MSVC project files -- not via CMake. This is done so everything we build has precisely the same compiler setup -- which happens to include building in UNICODE (wide character) mode. While tvision isn't yet part of that, I was looking into it and making sure I don't get compile or run-time surprises. Using the xyzA() Win32 APIs in any build mode solved it for me: tvision worked properly and showed the expected demo screens with those few tweaks.

Anyway, thanks for your response, including the second part. I was still assuming that some parts of Windows aren't completely UTF-8 compliant yet, but I must admit it's two years since I last tested that assumption (on Windows 10). It was failing then, at least for me: file paths with Unicode characters in them, e.g. Chinese, did not always get picked up correctly in non-UNICODE MSVC builds. That's the background this is coming from.

Met vriendelijke groeten / Best regards,

Ger Hobbelt



On Feb 24, 2023, magiblot closed #97 (https://github.com/magiblot/tvision/pull/97) via bb98e5f (https://github.com/magiblot/tvision/commit/bb98e5f3b84239e114bdfd7313ee59ab31acb201).


magiblot commented 1 year ago

Well, it is not that Windows has changed in the last two years, nor that UTF-8 can be made to work in all circumstances. The narrow char versions of console functions (e.g. WriteConsoleA) rely on the console code page, which can be set at runtime with SetConsoleCP and SetConsoleOutputCP. Conversely, functions related to the file system (e.g. FindFirstFileA) rely on the ANSI code page, which is a system-wide setting, and which is usually not the UTF-8 code page, so they are not likely to work with foreign language characters.
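
The two code pages can be inspected separately. The sketch below is only an illustration: `CodePages`, `queryCodePages` and `prepareUtf8` are hypothetical names, and the non-Windows branch simply reports UTF-8 for both values so the example builds elsewhere.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#endif

constexpr unsigned utf8CodePage = 65001; // numeric value of CP_UTF8

struct CodePages
{
    unsigned consoleOutput; // consulted by WriteConsoleA and similar functions
    unsigned ansi;          // consulted by FindFirstFileA and similar functions
};

inline CodePages queryCodePages()
{
#ifdef _WIN32
    return {GetConsoleOutputCP(), GetACP()};
#else
    // Assumption: no such distinction exists here, so report UTF-8 for both.
    return {utf8CodePage, utf8CodePage};
#endif
}

// Hypothetical helper: switches the console to UTF-8 at runtime, stores the
// resulting code pages in 'after', and returns whether the ANSI code page
// (which this function cannot change) is UTF-8 as well.
inline bool prepareUtf8(CodePages &after)
{
#ifdef _WIN32
    SetConsoleCP(utf8CodePage);
    SetConsoleOutputCP(utf8CodePage);
#endif
    after = queryCodePages();
    return after.ansi == utf8CodePage;
}
```

A `false` result from `prepareUtf8` would correspond to the situation described above: console output can be UTF-8, but narrow file system calls still use a different code page.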

On the other hand, standard library functions and classes (e.g. std::ifstream) will handle UTF-8 paths properly as long as you set a UTF-8 locale with setlocale.
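
A minimal sketch of that setup follows. The `".UTF8"` locale name is the spelling used in Microsoft's setlocale documentation; `enableUtf8Locale` and `readFirstLine` are hypothetical helpers, and the non-Windows branch is an assumption added so the example builds elsewhere.

```cpp
#include <cassert>
#include <clocale>
#include <fstream>
#include <string>

// Ask the C runtime for a UTF-8 locale. Returns false if the request fails,
// e.g. on a runtime without UTF-8 locale support.
inline bool enableUtf8Locale()
{
#ifdef _WIN32
    return std::setlocale(LC_ALL, ".UTF8") != nullptr;
#else
    // Assumption: paths are plain bytes here, so the user's locale is kept.
    return std::setlocale(LC_ALL, "") != nullptr;
#endif
}

// Hypothetical helper: reads the first line of a file whose path is given
// as UTF-8 encoded narrow chars. Returns an empty string if it cannot be read.
inline std::string readFirstLine(const std::string &utf8Path)
{
    std::ifstream in(utf8Path.c_str());
    std::string line;
    std::getline(in, line);
    return line;
}
```

Calling `enableUtf8Locale()` once at startup, before opening any files, is the intended usage; if it returns false, non-ASCII paths passed to `readFirstLine` may not resolve on Windows.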

Therefore, Turbo Vision uses the narrow char version of console functions and the standard library functions.

The only applications which see their behaviour affected by whether the unicode mode is enabled or not are those which use TCHARs and/or the encoding-neutral aliases of Win32 API functions.

¡Un saludo! / Salutacions! / Greetings!

GerHobbelt commented 1 year ago

Ah. Thanks for clearing that one up. My code called FindFirstFile and similar APIs directly, so that explains the results I got in non-UNICODE builds (Chinese filenames getting mangled, etc.). FWIW: I seem to recall that fopen() et al. (the C RTL) didn't work correctly with UTF-8 for such filenames, but my memory is fading/suspect, and chances are that I made a mistake in the setlocale setup as well.

Thanks for the feedback, take care and thanks for the modernized TVision. An old love of mine. :-)

Met vriendelijke groeten / Best regards,

Ger Hobbelt



magiblot commented 1 year ago

UTF-8 support in the standard library is a more recent feature (2018) and it is explained in https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support.