microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
10.21k stars 1.51k forks source link

`StlLCMapStringA.cpp`: Does `__crtLCMapStringA()` have bogus error paths? #4984

Open StephanTLavavej opened 1 month ago

StephanTLavavej commented 1 month ago

While fixing a CodeQL warning in __crtLCMapStringA(), I'm seeing code that makes no sense:

https://github.com/microsoft/STL/blob/1e312b38db8df1dfbea17adc344454feb8d00dd9/stl/src/StlLCMapStringA.cpp#L28-L122

Specifically, every single return retval; (except for the last one, which is the success path) looks like it's a failure case where we should actually be returning 0 (failure).

Early on, if we can't allocate an inwbuffer, we return 0, which makes total sense: https://github.com/microsoft/STL/blob/1e312b38db8df1dfbea17adc344454feb8d00dd9/stl/src/StlLCMapStringA.cpp#L62-L65

But later, if we can't allocate an outwbuffer, we return retval;, which makes no sense whatsoever: https://github.com/microsoft/STL/blob/1e312b38db8df1dfbea17adc344454feb8d00dd9/stl/src/StlLCMapStringA.cpp#L99-L102

Similarly for the other cases.

Notes:

  1. This code is very old and somewhat scary. We shouldn't mess with it until we have a crystal-clear understanding of what's going on, and ideally test cases that exercise various codepaths. Our test coverage here is very minimal.
    • I'm filing this issue as a reminder to thoroughly investigate this in the future - I specifically don't want to see a PR that just mechanically replaces lines with return 0; without a thorough analysis and evidence that this won't damage behavior in some way.
  2. We were confused long ago, and dllexported this (marked as _CRTIMP2), but it's used only within the STL and is not user-visible. I believe it's never been user-visible in the VS 2015+ era (need to confirm this). If it was never user-visible, then we need to maintain its signature for dllexport validation but we need not preserve its behavior.
    • I really don't like this function (beginning with the totally incorrect comment "Tries to use NLS API call LCMapStringA"), especially how it calls LCMapStringEx for LCMAP_SORTKEY (this is what I'm looking at for CodeQL). Only xstrxfrm.cpp has 2 calls with LCMAP_SORTKEY, so if we ever gain the understanding/courage to mess with this, we should consider splitting up this function's behavior.
cpplearner commented 1 month ago

__crtLCMapStringA is used by _Strxfrm, _Toupper, and _Tolower.

https://github.com/microsoft/STL/blob/1e312b38db8df1dfbea17adc344454feb8d00dd9/stl/src/xstrxfrm.cpp#L77-L90

StephanTLavavej commented 1 month ago

is expected to return the required size regardless of whether it can write to the buffer.

Ah, but we pass 0 for cchDest when we're doing that. Thus, if (0 != cchDest) is inactive in the SORTKEY block, which otherwise does nothing:

https://github.com/microsoft/STL/blob/1e312b38db8df1dfbea17adc344454feb8d00dd9/stl/src/StlLCMapStringA.cpp#L80-L93

We then fall through to the end of the function, where we use the "successful" return retval; - that's the only one I don't think is bogus.

AlexGuteniev commented 1 month ago

Quote from the offline documentation I still use when the offline is bogus (as I thing currently happens)

Return Values Returns the number of characters or bytes in the translated string or sort key, including a terminating null character, if successful. If the function succeeds and the value of cchDest is 0, the return value is the size of the buffer required to hold the translated string or sort key, including a terminating null character.

This function returns 0 if it does not succeed. To get extended error information, the application can call GetLastError. GetLastError can return one of the following error codes:

  • ERROR_INSUFFICIENT_BUFFER
  • ERROR_INVALID_FLAGS
  • ERROR_INVALID_PARAMETER