Missing symbols in MSVC C++ compilation, many headers fail to compile

lpisha commented 1 year ago

Simple reproducer and more info at this repo: https://github.com/lpisha/test_cudastd_msvc

This is in the context of projects which contain both C++ and CUDA source files, where the C++ files are compiled with MSVC and the CUDA files with nvcc. The project uses the CUDA Standard Library across host and device code, so common headers which get included by both .cpp and .cu files include parts of the CUDA Standard Library.
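For illustration, here is a minimal sketch of the kind of shared header described above. Every name in it (`DEMO_HAVE_CUDA_STD`, `DEMO_HOST_DEVICE`, `demo::clamp_to_float`) is hypothetical and not taken from the reproducer. The `__has_include` fallback to `<limits>` is an assumption added only so the sketch also compiles on a machine without the CUDA toolkit. The projects in question include `<cuda/std/limits>` unconditionally.

```cpp
// Shared header sketch: included by both .cpp (host compiler) and .cu (nvcc) files.
#if defined(__has_include)
#  if __has_include(<cuda/std/limits>)
#    include <cuda/std/limits>
#    define DEMO_HAVE_CUDA_STD 1
#  endif
#endif
#ifndef DEMO_HAVE_CUDA_STD
#  include <limits>  // assumption: fallback so the sketch builds without CUDA installed
#  define DEMO_HAVE_CUDA_STD 0
#endif

// __host__ / __device__ only exist when nvcc is the compiler.
#if defined(__CUDACC__)
#  define DEMO_HOST_DEVICE __host__ __device__
#else
#  define DEMO_HOST_DEVICE
#endif

namespace demo {
#if DEMO_HAVE_CUDA_STD
namespace lim = ::cuda::std;
#else
namespace lim = ::std;
#endif

// Clamp a double into the finite range of float; usable from host and device code.
DEMO_HOST_DEVICE inline float clamp_to_float(double v) {
  const double hi = lim::numeric_limits<float>::max();
  const double lo = lim::numeric_limits<float>::lowest();
  return static_cast<float>(v > hi ? hi : (v < lo ? lo : v));
}
}  // namespace demo
```

When a header like this is pulled into a .cpp file, the host compiler alone has to parse `<cuda/std/limits>`, which is the configuration where the errors in this report show up.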

During C++ compilation, two headers in the CUDA Standard Library, <cuda/std/limits> and <cuda/std/atomic>, expect that MSVC will have declared / defined certain symbols, but it has not, causing compilation errors. About half of the headers in the CUDA Standard Library include these two headers, making much of the library unusable.

This bug does not arise in the compilation of .cu files with nvcc, only in the compilation of .cpp files with MSVC.

The issues with <cuda/std/limits> were introduced between VS 2019 and VS 2022 due to changes to the MSVC headers. The issues with <cuda/std/atomic> do not appear to be version-related; it appears that the libcudacxx code could never have worked on MSVC in the C++ context.

Please see the README at https://github.com/lpisha/test_cudastd_msvc for more info.

wmaxey commented 1 year ago

Thanks for the report! We very recently made efforts to catch issues where libcu++ breaks when NVCC is not present. I'll follow up with your reproducer to see what we may have missed.

I distinctly remember fixing a similar sounding atomic macro that was missing...

wmaxey commented 1 year ago

NVIDIA/libcudacxx#340 has changes that seem relevant.

lpisha commented 1 year ago

Thanks for the reply!

Having moved from the version of libcudacxx included with CUDA 12.0 to top-of-tree, here are the new results:

The <cuda/std/atomic> issue is now fixed; it was indeed fixed by NVIDIA/libcudacxx#340. Thank you!
The <cuda/std/limits> issue remains in VS 2022. Compilation will fail when TEST_LIMITS is set (in the reproducer) and succeed again when TEST_FIX_LIMITS is also set.
The <cuda/std/functional> issue remains in VS 2022. Compilation will fail when TEST_FUNCTIONAL is set (in the reproducer).
A new issue has emerged in <cuda/std/type_traits> in VS 2017 (<cuda/std/detail/libcxx/include/__type_traits/disjunction.h>: line 47, "error C2210: '_First': pack expansions cannot be used as arguments to non-packed parameters in alias templates"). This header is also included by <cuda/std/limits>, which is in turn included by <cuda/std/atomic> and <cuda/std/functional>, so all the tests now fail in VS 2017.
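As a rough illustration of the C2210 diagnostic quoted above: it is reported when a parameter pack is expanded into an alias template whose parameters are not themselves packs. The sketch below is not the code at line 47 of disjunction.h and is not the fix that libcu++ adopted. It only shows the general shape of the rejected pattern (in a comment) and one common way to sidestep it, by routing the pack through class template specializations. The `demo` namespace and its contents are hypothetical.

```cpp
#include <type_traits>

namespace demo {

// Shape of the pattern the diagnostic complains about (left as a comment):
//
//   template <class First, class... Rest>
//   using pick = /* ... uses First ... */;
//   template <class... Ts>
//   using apply = pick<Ts...>;  // pack expanded into the non-pack parameter 'First'

// Alternative that avoids alias templates: partial specialization of a class template.
template <class... Ts>
struct disjunction : std::false_type {};  // empty pack: false

template <class First, class... Rest>
struct disjunction<First, Rest...>
    // Select First when it is true; otherwise continue with the remaining types.
    : std::conditional<bool(First::value), First, disjunction<Rest...>>::type {};

}  // namespace demo

static_assert(!demo::disjunction<>::value, "empty disjunction is false");
static_assert(demo::disjunction<std::false_type, std::true_type>::value,
              "true if any operand is true");
static_assert(!demo::disjunction<std::false_type, std::false_type>::value,
              "false if all operands are false");
```

The class-template form tends to cost more template instantiations than an alias-based one, which may be why an alias-based implementation was used in the first place. That is a guess, not something stated in this thread.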

Summary of fixes needed by version:

VS 2022: incorporate the TEST_FIX_LIMITS fix (or something equivalent); find a fix for the <cuda/std/functional> issue
VS 2019: no issues
VS 2017: find a fix for the <cuda/std/type_traits> issue
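For anyone who needs to guard a workaround per toolset, the three releases in the summary above can be told apart through MSVC's predefined `_MSC_VER` macro (19.1x for VS 2017, 19.2x for VS 2019, 19.3x and up for VS 2022). The helper below is a hypothetical sketch, not something from libcu++ or the reproducer. It takes the version as a plain integer so it can be exercised on any compiler.

```cpp
namespace demo {

// Map an _MSC_VER value to the Visual Studio release year it belongs to.
// Returns 0 for toolsets older than VS 2017.
// Note: values of 1930 and above are all reported as 2022 here, which is an
// assumption of this sketch and would need revisiting for later releases.
constexpr int vs_year(int msc_ver) {
  return msc_ver >= 1930 ? 2022
       : msc_ver >= 1920 ? 2019
       : msc_ver >= 1910 ? 2017
       : 0;
}

#if defined(_MSC_VER)
// On MSVC, the release the current translation unit is being built with.
constexpr int current_vs_year = vs_year(_MSC_VER);
#endif

}  // namespace demo
```

A workaround for the VS 2017 problem could then sit behind a preprocessor check such as `#if defined(_MSC_VER) && _MSC_VER < 1920`, leaving other toolsets untouched.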

I apologize for not trying top-of-tree initially; I looked to see if the code relating to the <cuda/std/limits> issue had been changed compared to 12.0, and it had not, so I just went ahead and opened the issue without checking the atomic stuff.

I also saw NVIDIA/cccl#968 and NVIDIA/cccl#940 after posting the issue (I only looked back two pages in the issues, not three 🙃 ), but I'm glad things have changed since then. The value of a library which works across host and device is substantially limited if the host code can't be compiled with a normal host compiler. If there were fundamental incompatibilities, that would be one thing, but these are just a few missing symbols and tweaks to a few templates.

miscco commented 1 year ago

@lpisha thanks a lot for the detailed description and reproducer. AFAIK we fixed the <limits> and <type_traits> issues, and I have a PR open that addresses the issues found in <functional>.
