NVIDIA / jitify

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
BSD 3-Clause "New" or "Revised" License

Cannot use `<limits>` and `<cuda/std/limits>` in the same source file #107

Open shwina opened 2 years ago

shwina commented 2 years ago

Invoking jitify with the following source file:

#include <limits>
#include <cuda/std/limits>

as follows:

jitify2_preprocess -std=c++11 -D__CUDACC_RTC__ test.hpp

results in:

Error processing source file test.hpp
Compilation failed: NVRTC_ERROR_COMPILATION
Compiler options: "-std=c++11 -D__CUDACC_RTC__ -include=jitify_preinclude.h -default-device"
detail/libcxx/include/limits(211): error: identifier "__CHAR_BIT__" is undefined

detail/libcxx/include/limits(312): error: identifier "__FLT_MANT_DIG__" is undefined

detail/libcxx/include/limits(313): error: identifier "__FLT_DIG__" is undefined

detail/libcxx/include/limits(321): error: identifier "__FLT_RADIX__" is undefined

detail/libcxx/include/limits(325): error: identifier "__FLT_MIN_EXP__" is undefined

<many more similar errors>

As a workaround I can do:

#include <limits>
#include <cuda/std/climits>
#include <cuda/std/limits>

maddyscientist commented 2 years ago

@benbarsdell this is the same issue I reported a while ago. Did you have a chance to think about how to fix this?

benbarsdell commented 2 years ago

I'll see if I can take another look at this later this week.

bdice commented 2 years ago

@benbarsdell Hi, any updates on this? I'm reviewing https://github.com/rapidsai/cudf/pull/11287 and would like to understand the issue / what solutions might be possible.

benbarsdell commented 2 years ago

I believe the root cause is that the header for `#include <climits>` is loaded from jitify's builtins and cached; then, when `#include "climits"` is encountered within libcu++, jitify uses the cached version instead of the new one.

The solution will be to distinguish between `#include <foo>` and `#include "foo"` in the header cache. However, this is further complicated by the fact that NVRTC does not support such a distinction. I think the only way around that will be to automatically patch `#include "foo"` to `#include </path/to/foo>` (if and only if /path/to/foo exists).

Unfortunately this is easier said than done, which is why I haven't got to it yet.

In terms of workarounds, removing `#include <limits>` and just using the libcu++ version should work, if that's doable in your code. There may be other workarounds too.
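
For readers trying to picture the fix proposed above, here is a minimal C++ sketch of the two ideas: a header cache keyed on both the include form and the name, and a helper that rewrites a quoted include into an angle-bracket include with a full path when the file exists. This is not jitify's code. Every name here (`IncludeKind`, `HeaderCache`, `patch_quote_include`, the `exists` callback) is hypothetical, and the sketch ignores details such as include search order, macros that expand to includes, and comments on the include line.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <regex>
#include <string>
#include <utility>

// Hypothetical: which syntactic form the include directive used.
enum class IncludeKind { Angle, Quote };

// Hypothetical cache keyed on (form, name), so that <climits> and "climits"
// can map to different header sources instead of colliding on the name alone.
class HeaderCache {
 public:
  void insert(IncludeKind kind, const std::string& name, std::string source) {
    assert(!name.empty());
    cache_[Key{kind, name}] = std::move(source);
  }

  // Returns a pointer to the cached source, or nullptr if there is no entry.
  const std::string* find(IncludeKind kind, const std::string& name) const {
    auto it = cache_.find(Key{kind, name});
    return it == cache_.end() ? nullptr : &it->second;
  }

 private:
  using Key = std::pair<IncludeKind, std::string>;
  std::map<Key, std::string> cache_;
};

// Hypothetical helper: rewrite `#include "foo"` to `#include <dir/foo>` if and
// only if `exists(dir/foo)` is true; otherwise return the line unchanged.
// `exists` is passed in so the caller decides how files are looked up.
std::string patch_quote_include(
    const std::string& line, const std::string& including_dir,
    const std::function<bool(const std::string&)>& exists) {
  assert(!including_dir.empty());
  static const std::regex re(R"re(^(\s*#\s*include\s*)"([^"]+)"(.*)$)re");
  std::smatch m;
  if (!std::regex_match(line, m, re)) return line;  // not a quoted include
  const std::string candidate = including_dir + "/" + m[2].str();
  if (!exists(candidate)) return line;  // leave the directive untouched
  return m[1].str() + "<" + candidate + ">" + m[3].str();
}
```

With a cache like this, an entry stored for the angle form of `climits` would not be returned for a later quoted `climits`, which is the collision described in the thread. The rewrite step is what would let a compiler that does not distinguish the two forms still receive an unambiguous path.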