NVIDIA / jitify

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
BSD 3-Clause "New" or "Revised" License
518 stars 64 forks source link

Improve exception message #95

Closed Robadob closed 2 years ago

Robadob commented 2 years ago

Exception incorrectly named the file containing the missing include, rather than the missing include itself.

benbarsdell commented 2 years ago

This is intended. The parent is what was expected to be found but was not. The fact that include_name was not found is only a warning (below that code).

But if it's useful, you could change it to something like "<include_name> was included by <parent_name> but the latter was not loaded by Jitify! ..."

Robadob commented 2 years ago

It's certainly confusing that it's referring to the fact, that the parent source file could not be loaded due to an unnamed include.

"<parent_name> could not be loaded, as include <include_name> was not found by Jitify!" ?

benbarsdell commented 2 years ago

What happens is that we failed to load include_name, so we try to ignore it by commenting out the corresponding include directive from the file that included it (parent_name). Under normal circumstances, parent_name was previously loaded by Jitify and is in the program_sources map. However, if NVRTC loaded parent_name without Jitify's knowledge (which is supposed to be impossible), then it's not found in the map, and this exception is thrown. So the exception really means "NVRTC somehow loaded parent_name without Jitify's knowledge", not that parent_name could not be loaded. Unless I'm forgetting something, the only way this used to happen was because NVRTC used to always search /usr/include even when it wasn't specified via a -I flag. We added a workaround for it, and NVRTC also fixed the problem itself a little while ago, so I don't think it should happen any more.

Are you seeing this exception thrown?

Robadob commented 2 years ago

A user was trying to include helper_math.h, a file found in CUDA samples, not the CUDA include dir.

They were getting this exception, which I can reproduce locally by throwing #include <helper_math.h> into any working file.

I get the same exception if i change the include to helper_math_doesnotexist.h, a file which NVRTC has no reason to automatically load as it doesn't exist.

I simply noticed the exception message looked wrong (to me), whilst investigating it.

The user's original query is here: https://github.com/FLAMEGPU/FLAMEGPU2/discussions/797

Robadob commented 2 years ago

(It would appear, the automatically commenting out includes, doesn't work if the include is found in the root file and leads to this exception?)

benbarsdell commented 2 years ago

It's strange that the parent file is myAgentFunction_impl.cu, which I assume is the main source file?

I think a full path include name like #include </usr/local/cuda/include/something.h> could potentially trigger the exception if something.h included a header that couldn't be found, but I'm not sure that that's what's going on here.

Robadob commented 2 years ago

It's strange that the parent file is myAgentFunction_impl.cu, which I assume is the main source file?

Yes, this is a function a user implements (normally as a string in Python), the name assigned to it is somewhat arbitrary, only relevant if we export the file to disk for debugging/profiling reasons.

We prepend several includes to their function src, but otherwise it's unmodified prior to passing it to Jitify.

benbarsdell commented 2 years ago

Ahh I think found the problem: https://github.com/FLAMEGPU/FLAMEGPU2/blob/27eec1cacf35051e524732cf1265733e45d1ca3f/src/flamegpu/model/AgentFunctionDescription.cpp#L530

The error message that Jitify parses contains the filename from the #line directive, which in this case doesn't match the program name given to Jitify/NVRTC (which is what is used in the program_sources map). If you remove the #line directive or make the filename match the program name then it should fix it.

I'll have a think about whether there's anything we can do on the Jitify side. I'm already working on a new way to do the header processing, so the problem might go away once that's finished.

Robadob commented 2 years ago

Ah, I'd forgotten about that. Yes that line is so that the profiler lines up with the file we export or compilation errors line up with the users source (as we inject includes above). The solution would be to changed the filename in the line above to match #line (although I presume that makes the inclusion of a filename with #line redundant).

Robadob commented 2 years ago

Although that #line statement is only included when OUTPUT_RTC_DYNAMIC_FILES is passed to the preprocessor. I've not currently got this enabled, and it's highly unlikely the user's build has this enabled either. We only tend to use it when profiling/debugging RTC functions.

Ah, below it's got #ifndef, misread them both as #ifdef.

Robadob commented 2 years ago

Ok I've played around with that function you linked, so #line now only includes a line number. This has switched the warning to outputdata(1): warning: helper_math.h: [jitify] File not found. Which I presume is what you expected.

However, compilation now fails with what would appear to be a propagated syntax error. Prior to this, I successfully compiled the same agent fn without issue. This is very odd.

---------------------------------------------------
--- JIT compile log for outputdata ---
---------------------------------------------------
flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: name followed by "::" must be a class or namespace name

flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: expected a ")"

flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: attribute "__device__" does not apply here

flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: name followed by "::" must be a class or namespace name

flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: expected an identifier

flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh(23): error: expected a ";"
............
Error limit reached.
100 errors detected in the compilation of "outputdata".
Compilation terminated.
Robadob commented 2 years ago

This is the src being passed to Jitify now. Which builds fine without #include <helper_math.h>, but has an unexpected compilation error with #include <helper_math.h>. It similarly builds fine if #line 1 is removed, or #include <helper_math.h> is commented out.

outputdata
#include "flamegpu/runtime/DeviceAPI.cuh"
#include "flamegpu/runtime/messaging/MessageNone/MessageNoneDevice.cuh"
#include "flamegpu/runtime/messaging/MessageSpatial3D/MessageSpatial3DDevice.cuh"
#line 1
#include <helper_math.h>
FLAMEGPU_AGENT_FUNCTION(outputdata, flamegpu::MessageNone, flamegpu::MessageSpatial3D) {
    // Output each agents publicly visible properties.
    FLAMEGPU->message_out.setVariable<flamegpu::id_t>("id", FLAMEGPU->getID());
    FLAMEGPU->message_out.setVariable<float>("x", FLAMEGPU->getVariable<float>("x"));
    FLAMEGPU->message_out.setVariable<float>("y", FLAMEGPU->getVariable<float>("y"));
    FLAMEGPU->message_out.setVariable<float>("z", FLAMEGPU->getVariable<float>("z"));
    FLAMEGPU->message_out.setVariable<float>("fx", FLAMEGPU->getVariable<float>("fx"));
    FLAMEGPU->message_out.setVariable<float>("fy", FLAMEGPU->getVariable<float>("fy"));
    FLAMEGPU->message_out.setVariable<float>("fz", FLAMEGPU->getVariable<float>("fz"));
    return flamegpu::ALIVE;
}
Robadob commented 2 years ago

Ah I think that makes sense. Jitify is now commenting out the wrong line (because of the #line statement, causing the compilation error to report the wrong line)!

Robadob commented 2 years ago

I guess that's just a side-effect of you using compilation errors to parse headers. Regardless I'll close this PR, I didn't set out to dig into it this, this afternoon. It was just supposed to be a quick passing improvement whilst adding support for users to specify include dirs.

It doesn't seem worth addressing if you're working on an improved header loading algorithm (presumably to address #90).