Closed kririae closed 2 months ago
The definition of EmptyKernel after macro expansion is:
template <typename T>
__global__ __attribute__((__visibility__("hidden"))) void EmptyKernel()
{}
which is the same for many other kernels we have inside CUB. Because the kernel is a template, the function EmptyKernel is implicitly inline, so it's fine to define it in the header, and the linker should discard all but one compiled version. This works correctly with the many compilers we test in our CI.
I am not familiar with "the OptiX linker", so here is some speculation: maybe it fails to mark the kernel inline because the template parameter is not used. Can you change the kernel to
template <typename T>
CUB_DETAIL_KERNEL_ATTRIBUTES void EmptyKernel(T*)
{}
and the later definition of EmptyKernelPtr to:
using EmptyKernelPtr = void (*)(void*);
and report whether this fixes the issue?
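The proposed change can be sketched in host-only C++ as follows. This is a minimal illustration, not the actual CUB source: CUB_DETAIL_KERNEL_ATTRIBUTES is stubbed to expand to nothing here (in CUB it supplies the kernel and visibility attributes) so that a plain C++ compiler accepts the code, and get_empty_kernel is a hypothetical helper added only for the example.

```cpp
// Assumption: the real macro expands to __global__ plus a visibility
// attribute; it is stubbed out so this sketch builds without nvcc.
#define CUB_DETAIL_KERNEL_ATTRIBUTES

// The template parameter now appears in the signature, so it is "used".
template <typename T>
CUB_DETAIL_KERNEL_ATTRIBUTES void EmptyKernel(T*)
{}

// Pointer type matching the EmptyKernel<void> instantiation.
using EmptyKernelPtr = void (*)(void*);

// Hypothetical helper: returns the address of the void instantiation.
inline EmptyKernelPtr get_empty_kernel()
{
  return &EmptyKernel<void>;
}
```

Calling get_empty_kernel() from several translation units that include the same header should yield one shared definition after linking, which is the behavior the OptiX linker appears not to honor in this report.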
Thank you for your patient reply! I've tried your suggested approach, but unfortunately, the renderer still reports the same error:
Error: Symbol '_ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_' was defined multiple times. First seen in: '0x71864e8064b77f71__closesthit__radiance-and-6-more'
I'd like to provide some additional information that might be helpful:
- We are using v2.4.0. We previously used an older version without the visibility attribute (sorry for not mentioning that), but the issue persists after upgrading to 2.4.0. Here's the PTX output of the corresponding section for the current CUB version:
// .weak _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_
.weak .entry _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_(
.param .u64 _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT__param_0
)
{
.loc 2 284 0
.loc 2 284 73
ret;
}
.entry __nv_static_37__4fac033d_15_bsdf_builtin_cu_c160a2af__ZN3cub17CUB_200400_860_NS48_GLOBAL__N__4fac033d_15_bsdf_builtin_cu_c160a2af11EmptyKernelIvEEvPT_(
.param .u64 __nv_static_37__4fac033d_15_bsdf_builtin_cu_c160a2af__ZN3cub17CUB_200400_860_NS48_GLOBAL__N__4fac033d_15_bsdf_builtin_cu_c160a2af11EmptyKernelIvEEvPT__param_0
)
{
.loc 2 285 0
.loc 2 285 73
ret;
}
Interestingly, when compiling to OptiX IR (-optix-ir), the linker issue disappears without adding the anonymous namespace. This leads me to suspect that.. it might be an OptiX-specific issue.
I'm not sure whether the visibility attribute is actually discarded. Compiling v2.4.0 reports no warning, but when I switched back to the older version and added the attribute manually, the compiler reported:
[7/10] Building CUDA object flux/CMakeFiles/flux.dir/src/film.cu.o
/home/krr/.cache/CPM/cccl/6b735c39125bb1ca2f7ad9b8fbf6b135f24dd2ed/cub/cub/cmake/../../cub/util_device.cuh:117:88: warning: ‘visibility’ attribute ignored [-Wattributes]
117 | __global__ __attribute__((__visibility__("hidden"))) void EmptyKernel(void) { }
| ^
In file included from tmpxft_001ac178_00000000-6_film.cudafe1.stub.c:1:
/tmp/tmpxft_001ac178_00000000-6_film.cudafe1.stub.c:1:115: warning: ‘visibility’ attribute ignored [-Wattributes]
1 | #pragma GCC diagnostic push
| ^
[8/10] Building CUDA object flux/CMakeFiles/flux.dir/src/light.cu.o
/home/krr/.cache/CPM/cccl/6b735c39125bb1ca2f7ad9b8fbf6b135f24dd2ed/cub/cub/cmake/../../cub/util_device.cuh:117:88: warning: ‘visibility’ attribute ignored [-Wattributes]
117 | __global__ __attribute__((__visibility__("hidden"))) void EmptyKernel(void) { }
These two files are unrelated to the OptiX kernels.
In any case, we do have a workaround, and it doesn't affect our development (hopefully?). Thank you again for your reply.
Here's the PTX output of the corresponding section for the current CUB version:
// .weak _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_
.weak .entry _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_(
The .weak is a strong indicator that the compiler understood the function to be inline. The linker should then discard all but one of the weak symbols. There may indeed be a problem with the linker.
- When using the anonymous namespace approach, we get the following PTX (and this works under OptiX):
.entry __nv_static_37__4fac033d_15_bsdf_builtin_cu_c160a2af__ZN3cub17CUB_200400_860_NS48_GLOBAL__N__4fac033d_15_bsdf_builtin_cu_c160a2af11EmptyKernelIvEEvPT_(
If I demangle that, I get .entry __nv_static_37__4fac033d_15_bsdf_builtin_cu_c160a2af_void cub::CUB_200400_860_NS::(anonymous namespace)::EmptyKernel<void>(void*)(. I don't know what the compiler makes of entities in an anonymous namespace, but __nv_static_37__4fac033d_15_bsdf_builtin_cu_c160a2af_ sounds like a generated symbol name that is random enough not to conflict with the symbols of other translation units. Anonymous namespaces have an effect similar to marking a symbol static, which is also hinted at by __nv_static_.
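For anyone who wants to repeat the demangling step programmatically, here is a minimal sketch using abi::__cxa_demangle from <cxxabi.h>. It assumes a GCC or Clang toolchain (Itanium C++ ABI); the demangle helper is a name chosen for this example. The __nv_static_... prefix is not part of the mangled name, so it has to be stripped by hand before passing the remainder (starting at _ZN) to the function.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the input
// unchanged if demangling fails.
inline std::string demangle(const char* mangled)
{
  int status = 0;
  // With a null output buffer, the result is allocated with malloc.
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr)
  {
    return mangled;
  }
  std::string result(out);
  std::free(out);
  return result;
}
```

Passing the symbol from the linker error, _ZN3cub17CUB_200400_860_NS11EmptyKernelIvEEvPT_, should give a name that identifies the EmptyKernel<void> instantiation inside the versioned cub namespace.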
So I guess that's your workaround for now: use an anonymous namespace or mark the kernel as static. The downside is that you will have many copies of EmptyKernel in your binary, so this solution may not scale to big kernels in programs with many translation units.
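The two workaround variants can be sketched in host-only C++ like this. It is an illustration under assumptions, not the CUB source: the CUDA kernel attributes are omitted so a plain C++ compiler accepts it, and the workaround namespace and the EmptyKernelStatic name are made up for the example.

```cpp
namespace workaround {

// Variant 1: an anonymous namespace gives the template internal linkage,
// so each translation unit that includes this code gets a private copy.
namespace {
template <typename T>
void EmptyKernel(T*)
{}
} // anonymous namespace

// Variant 2: an explicit static has the same effect on linkage.
template <typename T>
static void EmptyKernelStatic(T*)
{}

// Pointer type matching the <void> instantiations above.
using EmptyKernelPtr = void (*)(void*);

} // namespace workaround
```

Because neither variant exports a shared symbol, the linker has nothing to deduplicate, which is also why every translation unit carries its own copy of the code.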
This leads me to suspect that.. it might be an OptiX-specific issue.
If you can, please report this issue to the OptiX team. Thank you!
This leads me to suspect that.. it might be an OptiX-specific issue.
I've already been in touch with the internal OptiX team and they are looking into it. Early indications make it sound like this is indeed an OptiX issue setting the wrong linkage.
Thank you for your explanation and help, hope to see the issue gets figured out and fixed!
Thank you for your explanation and help, hope to see the issue gets figured out and fixed!
Hey @kririae, I heard from the internal OptiX team that this issue has been fixed in the development branch for the OptiX linker. They weren't sure on exact dates of when the fix will be public, but it should be relatively soon!
Happy to see that! Again thank you for your patience and help.
Is this a duplicate?
Type of Bug
Runtime Error
Component
CUB
Describe the bug
When more than one TU is compiled with CUB's cub/util_device.cuh header included, the OptiX linker reports the following error when optixPipelineCreate is invoked:
How to Reproduce
Reproducing this error requires a complete OptiX environment, so I forked the official OptiX_Apps as an example.
I can only use the PTX variant (instead of the OptiX IR variant) of OptiX because of this issue, so I slightly modified the code accordingly.
When executing the example renderer like:
The issue appears
Expected behavior
I temporarily worked around this issue by adding an anonymous namespace around the kernel definition, and the issue disappeared.
Reproduction link
https://github.com/kririae/OptiX_Apps/commit/8ee41786adf4ca23c7681aacd3c4abba3d7d4afb
Operating System
Arch Linux
nvidia-smi output
NVCC version