end2endzone / ShellAnything

ShellAnything is a C++ open-source software which allow one to easily customize and add new options to *Windows Explorer* context menu. Define specific actions when a user right-click on a file or a directory.
MIT License
180 stars 27 forks source link

String encoding in source code #91

Closed Cirn09 closed 2 years ago

Cirn09 commented 3 years ago

There are too many potential coding errors in source code for example: https://github.com/end2endzone/ShellAnything/blob/master/src/Win32Clipboard.cpp#L42

  static const UINT gFormatDescriptorBinary     = RegisterClipboardFormat("Binary");
  static const UINT gFormatDescriptorDropEffect = RegisterClipboardFormat("Preferred DropEffect");

it only work right when not define UNICODE. And RegisterClipboardFormatA will convert "Binary" to L"Binary", then call RegisterClipboardFormatW. To improve compatibility, should write like RegisterClipboardFormatA("Binary"), or RegisterClipboardFormatW(L"Binary"), or more appropriate: RegisterClipboardFormat(TEXT"Binary"))

I tried to modify a part of the code, but I found that this caused coding confusion in the code. So I recommend ​using TCHAR instead of char, _stprintf_s instead of sprintf_s, TEXT("bala") instead of "bala"

#ifdef UNICODE
 ​#define tstring std::wstring
#else
 ​#define tstring std::string
#endif

or more cpp style

typedef basic_string<TCHAR, char_traits<TCHAR>, allocator<TCHAR> >  tstring;

instead of std::string (ref: https://stackoverflow.com/questions/36197514/what-is-the-macro-for-stdstring-stdwstring-in-vc)

And define UNICODE and _UNICODE

end2endzone commented 3 years ago

You are right about the confusion with the coding standard from Win32Clipboard.cpp (and Win32Registry.h, Win32Registry.cpp and Win32Clipboard.h) with the rest of the software. These files comes from another of my projects and were written in Windows 2000 or Windows XP days. The code standard and style does not exactly match the main code. In the example above, the word "Binary" only contains English letters and does not require UTF-8 or UNICODE encoding. The code should be modified to implicitly call RegisterClipboardFormatA().

  static const UINT gFormatDescriptorBinary     = RegisterClipboardFormatA("Binary");
  static const UINT gFormatDescriptorDropEffect = RegisterClipboardFormatA("Preferred DropEffect");

I don't want to sound rude but I do not understand your point about supporting the UNICODE preprocessor. The UNICODE preprocessor was created in order to be able to compile an application for ANSI code page and for Microsoft UNICODE code page. The ANSI version of ShellAnything does not exists and I do not have any plans on supporting an ANSI version of the ShellExtension.

ShellAnything is designed to be compiled without UNICODE (or _UNICODE). The main code is written to implicitly use the Wide variety of Win32 functions and not rely on macros based on UNICODE preprocessor. In order words, the code should call CreateFileW() (or in rare occasions CreateFileA()) directly instead of using CreateFile() macro. The code may still be using Win32 macros when calling a system function but each location in the code should be identified and change accordingly.

Since data in ShellAnything is stored as UTF-8 strings , using char instead of TCHAR is preferable to prevent confusion. In my experience, supporting UNICODE preprocessor is a real pain. It leads to multiple encoding bugs. Everytime you have to save a tstring to a file, you need to convert the encoding from TCHAR to char anyway. This is why I decided to use UTF-8 for encoding strings. It is not ANSI nor UNICODE (a.k.a Microsoft utf-16 ish). I do not see any benefit in using TCHAR.

Even if the Shell Extension is compiled without the UNICODE preprocessor, it is still registering itself as a wide-characters compatible application to the system. It is also using the wide variety of Win32 functions (functions names ending with W).

If you need to define a string with Chinese characters, just use the wide function and use the L"" prefix. Maybe I could better understand your need if you would give an example.

Cirn09 commented 3 years ago

ShellAnything works very well in non-English environment, and there is no compatibility problem. This issue was raised only because I was confused about the parameter types caused by functions such as GetModuleHandleEx not being used with the TEXT macro. Since there is no plan to compile the ASCII version, the suggestion of using TCHAR and tstring I mentioned before is unnecessary.

But I still recommend using a function like GetModuleHandleExA to avoid confusion when reading source code.