In our `__config` we have the following: https://github.com/NVIDIA/libcudacxx/blob/b8b37d6784188cf4e64eb2f04d89797e8f77bea1/include/cuda/std/detail/__config#L14-L39

This defines things like `__host__` to be empty when we detect that we're not compiling with nvcc, in order to silence the unrecognized-symbol errors that would otherwise come from host compilers that don't understand what `__host__` is.

That is a reasonable thing in many cases, but there are legitimate use cases where this can break. For example, the following file compiled with a host compiler will fail to compile:
This fails with a redefinition error for `__host__` and friends: `cuda/std/type_traits` detects that we're not compiling with nvcc and defines `__host__` to empty, but then `cuda_runtime_api.h` also defines `__host__`, causing the redefinition error.

One could reasonably argue that the runtime headers should guard against this situation better by undefining `__host__` before defining it. However, it is most expedient for us to fix this in libcu++.

The only robust solution I can think of is to avoid ever defining `__host__`/`__device__`/`__forceinline__` ourselves and instead use our own macros like:

And use those anywhere we would otherwise use `__host__` (or the other macros).

I'd expect we should rarely be using host/device annotations directly anyway. Those should come from `_LIBCUDACXX_INLINE_VISIBILITY` or just `_LIBCUDACXX_EXECUTION_SPACE_SPECIFIER`.
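
As a rough sketch of the library-owned macros proposed above, assuming we key off `__CUDACC__` to detect nvcc: the `MYLIB_*` names and `add_one` are hypothetical placeholders, not macros libcu++ defines, and the trailing `#define __host__` is only a stand-in for a header that defines `__host__` itself later on.

```cpp
// Hypothetical, library-prefixed annotation macros. The library never
// defines __host__, __device__, or __forceinline__ itself.
#if defined(__CUDACC__)
// Compiling with nvcc: forward to the real CUDA annotations.
#  define MYLIB_HOST        __host__
#  define MYLIB_DEVICE      __device__
#  define MYLIB_FORCEINLINE __forceinline__
#else
// Host-only compiler: the CUDA annotations are unknown, so expand to
// nothing (plain `inline` stands in for the force-inline hint).
#  define MYLIB_HOST
#  define MYLIB_DEVICE
#  define MYLIB_FORCEINLINE inline
#endif

#define MYLIB_HOST_DEVICE MYLIB_HOST MYLIB_DEVICE

// Library code only ever spells the prefixed macros.
MYLIB_HOST_DEVICE MYLIB_FORCEINLINE int add_one(int x) { return x + 1; }

#if !defined(__CUDACC__)
// Stand-in (assumption) for a header included afterwards that defines
// __host__ on its own. Since the library left __host__ untouched, this is
// the first definition and cannot clash with anything above.
#  define __host__
#endif
```

Because the library only ever defines names in its own namespace of macros, include order relative to the runtime headers should no longer matter.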