Closed Robadob closed 3 years ago
Thanks for this!
Can you check if changing these two lines: https://github.com/NVIDIA/jitify/blob/788039b/jitify.hpp#L1924-L1925
to USHRT_MIN
and USHRT_MAX
instead (so that they're unsigned instead of signed) fixes it?
According to this page, wchar_t
on Windows is supposed to be unsigned, which I think is the root of the bug. If the constant values are correct then it shouldn't complain about narrowing conversion any more.
Changing those has no visible impact to the original bug, get the same "conversion from int to wchar_t" error.
Even making them include a cast, made no difference. Suggesting they're not being pulled from there?
"#define WCHAR_MIN (wchar_t)USHRT_MIN\n"
"#define WCHAR_MAX (wchar_t)USHRT_MAX\n"
It appears "#if defined _WIN32 || defined _WIN64\n"
is not being used, so they're being taken from the #else
case.
If I instead change the else case
The error changes to this.
---------------------------------------------------
--- JIT compile log for my_program ---
---------------------------------------------------
limits(124): error: identifier "USHRT_MIN" is undefined
1 error detected in the compilation of "my_program".
---------------------------------------------------
I guess USHRT_MIN doesn't have a define as it should always be 0 for any unsigned vals.
So this version works, however i've had to update the wrong branch of the #if
. So there's a second issue with #if defined _WIN32 || defined _WIN64
not automatically triggering on a windows build.
Changing it to "#if defined(_WIN32) || defined(_WIN64)\n"
(the format I'm more familiar with) doesn't help, so I guess Jitify/NVRTC isn't defining them internally?
NVRTC is supposed to define _WIN64
.
But evidently it doesn't. The below version, where I manually define _WIN64
inline does work, but obviously isn't a viable fix.
Hmm, so _WIN64
is first documented as predefined in CUDA 11.0 (based on @ptheywood 's research).
Did notice, that despite building with CUDA 11.2 and providing CUDA 11.2 include path, I was linking against CUDA 11.0's nvrtc.
On updating to link against 11.2, so I could test that. Now getting an exception thrown by Jitify due to an NVRTC error NVRTC_ERROR_BUILTIN_OPERATION_FAILURE
at jitify.hpp#L2836
Realised when I manually set the PATH variables for CUDA 11.2, had left in ;
which was causing weird behaviours.
Thanks for digging into this. Is _WIN64
defined for you when using CUDA 11.0?
We can probably replace the OS-check with a sizeof(wchar_t)
check instead to avoid this problem.
Ok starting from the beginning, using CUDA 11.2, and NVRTC 11.2.
The original version produces this error, same as the error reported in #85, except reported once rather than twice.
---------------------------------------------------
--- JIT compile log for my_program ---
---------------------------------------------------
limits(124): error: invalid narrowing conversion from "int" to "wchar_t": constant value does not fit in destination type
1 error detected in the compilation of "my_program".
---------------------------------------------------
Your proposed fix, over changing L1924-1925 to USHRT_MIN
, USHRT_MAX
also fails, as limits.h
doesn't define unsigned min values (besides int?).
---------------------------------------------------
--- JIT compile log for my_program ---
---------------------------------------------------
limits(124): error: identifier "USHRT_MIN" is undefined
1 error detected in the compilation of "my_program".
---------------------------------------------------
Replacing USHRT_MIN
with 0, does however work.
Unclear whether this also works with CUDA 11.0 (and my mix/match of cuda 11.0 and 11.2 was causing the predef macro to go AWOL). I'll update the PR commit and then try and test it with cuda 11.0.
So the current version in this PR works for CUDA 11.2, but not 11.0. For some reason 11.0 does not predefine _WIN64
macro, despite it being documented.
I haven't got 11.1 installed, so can't test that.
We can probably replace the OS-check with a
sizeof(wchar_t)
check instead to avoid this problem.
To the best of my understanding preprocessor can't evaluate sizeof()
, similarly a quick test of #if sizeof(wchar_t)==2
was simply treated as false.
It could be easier to move the #if
out of the string, so it's checked against the host compiler instead.
The second commit applies this change, and works for me with both CUDA 11.0 and 11.2 (on Windows).
Clarity on the 11.0.
Docs archive link for 11.0 docs is for 11.0.3 (has same link as 11.0 update 1). My installed 11.0 docs state nvrtc version 11.0,182, and do not include _WIN64
in the documented predefines.
Using the host compiler macro looks like a good solution. I think it would be best to put this in the preinclude header here so that it fixes all usages:
// WAR to allow exceptions to be parsed
#define try
#define catch(...)
)"
#if defined(_WIN32) || defined(_WIN64)
// WAR for NVRTC <= 11.0 not defining _WIN64.
R"(
#ifndef _WIN64
#define _WIN64 1
#endif
)"
#endif
;
(I haven't tested this on Windows myself yet; let me know if you see anything wrong with it).
A quick test shows that works as expected, fixing my CUDA 11.0.
Have reverted the previous fix and added your change instead.
Let me know if you want me to squash. all the changes into a single commit/remove the two commits that cancel etc.
A squash or rebase to remove the canceled commits would be great, thanks. Apart from that this looks good.
Have rebased, also re-ordered the 2 commits as they're technically separate issues but one is dependent on the other.
This adds an explicit cast to the two literal values which building with
<limits>
on windows complains about.Tested that it fixes the reported error in #85, by adding
"#include <limits>\n"
totest_simple
ofjitify_example.cpp
. Haven't tested it on Linux, where presumably<limits>
was already working.Closes #85