JacksonAllan / CC

A small, usability-oriented generic container library.
MIT License
171 stars 6 forks source link

Support for MSVC 19.39 (or later) #9

Closed Hizuru3 closed 1 month ago

Hizuru3 commented 5 months ago

Thank you for the cool library and the nice article!

Recently, MSVC has finally started to support the __typeof__ operator (https://learn.microsoft.com/en-us/cpp/c-language/typeof-c?view=msvc-170). It would be good if CC library could be compiled with the newer version of MSVC. However, there are still some compatibility problems with the compiler:

  1. MSVC's doesn't define max_align_t.

  2. __builtin_unreachable() is not available on MSVC.

  3. MSVC treats char and signed char as the same type so that they can't coexist in an association list of _Generic. (Probably, that doesn't conform the C standard)

  4. The old buggy preprocessor can't handle variadic macros properly. (It matters only in C++ since the "updated preprocessor" seems to be used in c11/c17/clatest mode by default.)

I'm writing a PR that resolves the issues listed above except 4. To manage the preprocessor problem, more work will be needed.

Thanks for reading.

JacksonAllan commented 5 months ago

Hi! Thanks for the input!

typeof was standardized in C23, so my intention was already to introduce MSVC support once Microsoft catches up. I didn’t realize that this change had already occurred.

Regarding the other compatibility problems you found:

MSVC's doesn't define max_align_t.

MSVC has boasted C11 support for a while, so it’s rather surprising that it’s missing max_align_t. In practice, _Alignof( max_align_t ) can be substituted with a best-guess approach that takes the maximum alignment of a range of fundamental datatypes, as we used to do before C11. But there might be a better solution - I'll look into it.

__builtin_unreachable() is not available on MSVC.

If I recall correctly, __builtin_unreachable() is only used once and for the purpose of silencing a superfluous compiler warning. I’m surprised to see now that I didn’t include a comment about this matter in the code. In any case, the associated check can probably just be #ifdefed out for non-__GNUC__ compilers.

MSVC treats char and signed char as the same type so that they can't coexist in an association list of _Generic. (Probably, that doesn't conform the C standard)

Yes, MSVC is sadly non-conforming in this regard, and Microsoft has (last time I checked) expressed no intention to fix the issue. But it’s possible to work around it using either nested _Generic expressions or a dummy type. CC is already using the latter solution for size_t.

The old buggy preprocessor can't handle variadic macros properly. (It matters only in C++ since the "updated preprocessor" seems to be used in c11/c17/clatest mode by default.)

Is this still an issue when compiling in C++11 and later (the minimum that CC already requires)?

Hizuru3 commented 5 months ago

Thank you for your response.

Actually, I already wrote a PR for the issues (https://github.com/JacksonAllan/CC/pull/10). Sorry for messing up. There may be better solutions and it needs to be tested anyway.

Is this still an issue when compiling in C++11 and later (the minimum that CC already requires)?

Yes, it seems so. Compiling as a C++ code, you need the /Zc:preprocessor option. Specifying the language standard (/std:c++14, /std:c++20, etc.) is insufficient.

according to this page: https://learn.microsoft.com/en-us/cpp/build/reference/zc-conformance?view=msvc-170

/Zc:preprocessor[-] Use the new conforming preprocessor. Off by default unless /std:c11 or later is specified.

The following code is indeed compilable as C with /std:c11 but refuses to be compiled as C++ even with /std:c++20.

#include <stdio.h>

#define GET_FIRST_( first, ... ) first
#define GET_FIRST( ... ) GET_FIRST_( __VA_ARGS__ )

int main( void ) {
    int x = GET_FIRST(5, 10);
    printf("%d\n", x);
    return 0;
}
JacksonAllan commented 5 months ago

I had a chance to look more into these matters today, as well as to read over your PR (thanks!). A few thoughts:

Regarding max_align_t, it seems that for MSVC in C++ mode, alignof( std::max_align_t ) is only 8 (which is also what we will get by typedefing cc_max_align_t as long double, as in the PR). For GCC and Clang, it is 16. Might it be better to assume a maximum alignment of 16 even for MSVC in order to ensure that CC containers can be used with SIMD types?

Regarding the nonconforming preprocessor, I think it would indeed be good to support it if possible. One of the "conveniences" of CC is supposed to be that it doesn't require users to add any special compiler flags to their builds (besides specifying C11/C++11 or later). A quick experiment showed that your trivial example builds fine if we modify it as follows:

#include <stdio.h>

#define MSVC_BUG_FIX( macro, args ) macro args

#define GET_FIRST_( first, ... ) first
#define GET_FIRST( ... ) MSVC_BUG_FIX( GET_FIRST_, ( __VA_ARGS__ )  )

int main(void)
{
    int x = GET_FIRST( 5, 10 );
    printf( "%d\n", x );
    return 0;
}

But I'll need to check to see in what other cases __VA_ARGS__ is used in CC and whether any other issues arise.

Regarding the char/signed char clash, your nested _Generic solution looks good in principle, although I'll need to test it, and it's a little incongruent with the way the similar issue with size_t is currently handled. Ideally, these two issues would be handled the same way. So I'll experiment to see whether the handling of size_t can be changed to follow your solution for handling signed char, or vice versa.

Regarding the MSVC bit-scan optimizations (for iterating maps and sets), these were actually already implemented and thoroughly tested in Verstable, so I'll likely lift the code straight from there.

Hizuru3 commented 4 months ago

Sorry for the hiatus. I have some comments:

Regarding max_align_t, it seems that for MSVC in C++ mode, alignof( std::max_align_t ) is only 8 (which is also what we will get by typedefing cc_max_align_t as long double, as in the PR). For GCC and Clang, it is 16. Might it be better to assume a maximum alignment of 16 even for MSVC in order to ensure that CC containers can be used with SIMD types?

On the 64-bit environment, it's probably OK to have a 16-byte alignment. On 32-bit Windows, only 8-byte alignment is guaranteed by the default allocator (i.e. malloc/realloc) anyway so that we can't apply the same assumption there but I'm not sure how to handle these cases.

It seems your approach that forces to expand variadic arguments works fine in most cases but a problem happens to arise when combining with the string concatanating macro, as in CC_SELECT_ON_NUM_ARGS() :(

Please let me note that builtin_clzll() and _BitScanReverse64() work in a kinda different way. For example, builtin_clzll(160) returns 56 (assuming long long has 64 bits). As for _BitScanReverse64(&index, 160), index gets 7. In order to get a same result, you have to take xor with 63 for one's output.

JacksonAllan commented 3 months ago

Sorry for the very late response! I was busy preparing this article, which benchmarks CC's map against a range of other hash tables.

The next project on the to-do list is to introduce MSVC support for CC. I'm not sure how long it will take, but I'll be looking closely (again) at your PR request in coming days.

Please let me note that builtin_clzll() and _BitScanReverse64() work in a kinda different way. For example, builtin_clzll(160) returns 56 (assuming long long has 64 bits). As for _BitScanReverse64(&index, 160), index gets 7. In order to get a same result, you have to take xor with 63 for one's output.

Thanks! After a bit of reading, I understand what you're talking about. I'm not sure what you mean by "take xor with 63 for one's output", but it looks like the common solution is to subtract the result from 63. So there is a bug in Verstable for big-endian machines that could be fixed thusly:

static inline int vt_first_nonzero_uint16( uint64_t val )
{
  unsigned long result;

  const uint16_t endian_checker = 0x0001;
  if( *(char *)&endian_checker )
    _BitScanForward64( &result, val );
  else
-   _BitScanReverse64( &result, val );
+ {
+   _BitScanReverse64( &result, val );
+   result = 63 - result;
+ }

  return result / 16;
}

(Note that vt_first_nonzero_uint16 assumes that the input value is not zero, so that special case doesn't need to be handled.)

It's good to know about this bug before I accidentally convey it over to CC.

JacksonAllan commented 3 months ago

I'm working on MSVC support now based on your pull request, which I merged into the "dev" branch.

Regarding this part, which handles the situation wherein stdalign.h does not exist under MSVC:

// stdalign.h may be not recognized under some conditions.
// As the header file is very small, just emulate it here.
#ifndef __cplusplus
#define alignas _Alignas
#define alignof _Alignof
#endif
#define __alignas_is_defined 1
#define __alignof_is_defined 1

I remember running into this missing header issue in the past, around the time that MSVC first started boasting C11/C17 support. However, it looks like the header is present now via Windows SDK 10.0.20348.0 (i.e. version 2104, released 2020) or later.^1 Since CC already requires a much more recent version of the MSVC compiler for typeof, maybe we don't need to handle this special case? In other words, perhaps we can assume that any user running a version of MSVC from 2024 or later would also have a Windows SDK version from 2020 or later? Anecdotally, I had to upgrade Visual Studio to get typeof, and I now have stdalign.h too.

https://github.com/microsoft/vcpkg/issues/25654#issuecomment-1180039747

JacksonAllan commented 2 months ago

Quick update: MSVC support is now available on the dev branch. This includes support for MSVC’s nonconforming preprocessor when compiling in C++ with MSVC's default settings.

Changes in relation to the original pull request:

I’ll be releasing the new version with this feature soon.

JacksonAllan commented 2 months ago

I've just released the new version with MSVC support. I'll leave this issue open for the moment in case there are any comments.