from nvfuser import FusionDefinition, DataType
import torch

with FusionDefinition() as fd:
    # 1-D contiguous Int32 input with a symbolic (runtime) length
    t1 = fd.define_tensor(symbolic_sizes=[-1], contiguous=[True], dtype=DataType.Int32)
    # round on an integer tensor is what triggers the NVRTC error below
    t2 = fd.ops.round(t1)
    fd.add_output(t2)

a = torch.ones(2, device="cuda", dtype=torch.int32)
fd.execute((a,))
CUDA NVRTC compile error: __tmp_kernel1.cu(9175): error: more than one instance of overloaded function "nearbyint" matches the argument list:
function "nearbyint(double)"
__nv_nvrtc_builtin_header.h(156230): here
function "nearbyint(float)"
__nv_nvrtc_builtin_header.h(157258): here
argument types are: (int)
1 error detected in the compilation of "__tmp_kernel1.cu".
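
Possible workaround until this is fixed (a sketch, not a confirmed fix): rounding an integer is the identity, so the caller can skip the round op for integer inputs. The helper name round_or_passthrough, the is_integer_dtype flag, and the _RecordingOps stand-in are all hypothetical and not part of nvfuser. The stand-in only mimics an object with a round method so the snippet runs without a GPU. In a real fusion the ops argument would presumably be fd.ops.

```python
def round_or_passthrough(ops, tensor, is_integer_dtype):
    """Hypothetical helper: emit round() only for non-integer inputs."""
    if is_integer_dtype:
        # Rounding an integer is the identity, so no op is emitted and the
        # ambiguous nearbyint(int) call is never generated.
        return tensor
    return ops.round(tensor)


class _RecordingOps:
    """Stand-in for an ops namespace (assumption); records which ops were emitted."""

    def __init__(self):
        self.calls = []

    def round(self, tensor):
        self.calls.append(("round", tensor))
        return f"round({tensor})"


ops = _RecordingOps()
int_result = round_or_passthrough(ops, "t_int", is_integer_dtype=True)
float_result = round_or_passthrough(ops, "t_float", is_integer_dtype=False)
```

This only avoids the failing path on the user side. The expected behavior of fd.ops.round for integer dtypes (no-op or a clear error at definition time) would still need to be settled in nvfuser itself.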
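🐛 Describe the bug
Calling fd.ops.round on an Int32 tensor fails at NVRTC compile time. The generated kernel calls nearbyint with an int argument, which is ambiguous between the nearbyint(double) and nearbyint(float) overloads. The script and the full compiler error are shown above.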
Versions
devel