NVIDIA / NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.
Apache License 2.0

`NVTX3_CPP_REQUIRE_EXPLICIT_VERSION` is problematic in header-only libraries #93

Open bernhardmgruber opened 5 months ago

bernhardmgruber commented 5 months ago

Hi, the documentation for NVTX3_CPP_REQUIRE_EXPLICIT_VERSION in the nvtx3.hpp header containing the C++ API explains the following:

... the recommended best practice for instrumenting header-based libraries with NVTX C++ Wrappers is to #define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION before including nvtx3.hpp, #undef it afterward, and only use explicit-version symbols.

However, this breaks user code using the unversioned API directly.

For example:

#include <my_library.hpp> // includes NVTX3
#include <nvtx3/nvtx3.hpp> // user also includes NVTX3

int main() {
  nvtx3::scoped_range domain; // user uses unversioned API
}

If my_library.hpp now changes from

#include <nvtx3/nvtx3.hpp>

to

#define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION
#include <nvtx3/nvtx3.hpp>
#undef NVTX3_CPP_REQUIRE_EXPLICIT_VERSION

the above user program breaks, because the unversioned API is no longer provided.

This happens in the second case because the first inclusion of nvtx3.hpp (via my_library.hpp) defines NVTX3_CPP_DEFINITIONS_V1_0 and emits the symbols into namespace nvtx3::v1. Since NVTX3_CPP_REQUIRE_EXPLICIT_VERSION is defined at that point, v1 is not an inline namespace. The second inclusion (by the user) then sees that NVTX3_CPP_DEFINITIONS_V1_0 is already defined and does not provide the unversioned API (e.g., by inlining the v1 namespace).
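
To make the mechanism concrete, here is a minimal single-file sketch that mimics the two inclusions. It is not NVTX's actual code: the `demo` namespace, all `DEMO_*` macros, and the `where()` member are hypothetical stand-ins, and the real header may be structured differently.

```cpp
#include <cstring>

// Hypothetical stand-in for nvtx3.hpp. All DEMO_* macros and the demo
// namespace are invented for this sketch; they are not NVTX's real code.

// --- first "inclusion", as done by my_library.hpp ---
#define DEMO_REQUIRE_EXPLICIT_VERSION
#ifndef DEMO_DEFINITIONS_V1_0
#define DEMO_DEFINITIONS_V1_0
#ifdef DEMO_REQUIRE_EXPLICIT_VERSION
#define DEMO_INLINE_THIS_VERSION         // v1 stays a regular namespace
#else
#define DEMO_INLINE_THIS_VERSION inline  // v1 would become an inline namespace
#endif
namespace demo {
DEMO_INLINE_THIS_VERSION namespace v1 {
struct scoped_range {
  const char* where() const { return "demo::v1"; }
};
}  // namespace v1
}  // namespace demo
#undef DEMO_INLINE_THIS_VERSION
#endif
#undef DEMO_REQUIRE_EXPLICIT_VERSION

// --- second "inclusion", as done by the user without the macro ---
#ifndef DEMO_DEFINITIONS_V1_0
#error "never reached: the guard macro is already defined"
#endif

// The explicit-version name is usable after both inclusions.
inline bool explicit_name_works() {
  demo::v1::scoped_range r;
  return std::strcmp(r.where(), "demo::v1") == 0;
}

// The unversioned name was never introduced, so this line would not compile:
// demo::scoped_range r;
```

The guard macro makes the second inclusion a no-op, so whether v1 is inline is decided entirely by whoever includes the header first in the TU.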

We observed this behavior in the following PR to CCCL/CUB: https://github.com/NVIDIA/cccl/pull/1688

I therefore think that header-only libraries must not define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION, since doing so can break user code. Please correct me if I am wrong. Otherwise, I would kindly ask you to update the guidance provided by the documentation.
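
For contrast, a small sketch of what an inline version namespace provides when nobody requests explicit versions. Again, `demo` and `both_spellings_compile` are hypothetical names rather than NVTX's real code, and this only illustrates the C++ language mechanism; it is not guidance from the NVTX maintainers.

```cpp
#include <type_traits>

// Hypothetical stand-in for the header contents when the
// explicit-version macro is NOT defined: v1 is an inline namespace.
namespace demo {
inline namespace v1 {
struct scoped_range {};
}  // namespace v1
}  // namespace demo

// With an inline namespace, the unversioned and the versioned spelling
// name the same type.
static_assert(
    std::is_same<demo::scoped_range, demo::v1::scoped_range>::value,
    "both spellings must denote the same type");

inline bool both_spellings_compile() {
  demo::scoped_range a;      // unversioned API
  demo::v1::scoped_range b;  // explicit-version API
  (void)a;
  (void)b;
  return true;
}
```

In this configuration both the unversioned and the explicit-version names are available in the same TU, which is exactly what is lost once the first inclusion defines the macro.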

bernhardmgruber commented 5 months ago

We discovered a further problematic example. Assume the following situation:

my_library.hpp uses NVTX like:

#include <nvtx3/nvtx3.hpp>
...
nvtx3::scoped_range

But the user requests the explicit version:

#define NVTX3_CPP_REQUIRE_EXPLICIT_VERSION
#include <nvtx3/nvtx3.hpp>
#include <my_library.hpp>

int main() {
  nvtx3::v1::scoped_range r; // user uses explicit-version API
}

This also fails compilation.

The underlying problem is that the explicit and non-explicit APIs of the same version cannot coexist in the same translation unit (TU). What is your guidance to resolve this? Thanks!