csarofeen / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org

nearbyint fails with NVRTC compile error for integer inputs #2524

Open IvanYashchuk opened 1 year ago

IvanYashchuk commented 1 year ago

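🐛 Describe the bug

Calling fd.ops.round on an Int32 tensor fails when the kernel is compiled: the generated code calls nearbyint with an int argument, which NVRTC cannot resolve between the nearbyint(double) and nearbyint(float) overloads. Reproduction: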

from nvfuser import FusionDefinition, DataType
import torch

with FusionDefinition() as fd:
    t1 = fd.define_tensor(symbolic_sizes=[-1], contiguous=[True], dtype=DataType.Int32)
    t2 = fd.ops.round(t1)
    fd.add_output(t2)

a = torch.ones(2, device="cuda", dtype=torch.int32)
fd.execute((a,))

Running the script fails with:

CUDA NVRTC compile error: __tmp_kernel1.cu(9175): error: more than one instance of overloaded function "nearbyint" matches the argument list:
            function "nearbyint(double)"
__nv_nvrtc_builtin_header.h(156230): here
            function "nearbyint(float)"
__nv_nvrtc_builtin_header.h(157258): here
            argument types are: (int)

1 error detected in the compilation of "__tmp_kernel1.cu".
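
A possible user-side workaround, sketched under assumptions: rounding an integer leaves it unchanged, so the fusion definition could skip fd.ops.round for integer dtypes and never emit nearbyint(int). The helper name round_unless_integer and its is_integer flag are hypothetical and not part of nvfuser; only fd.ops.round is taken from the reproduction above.

```python
def round_unless_integer(fd, tensor, is_integer):
    """Hypothetical helper (not an nvfuser API).

    Applies fd.ops.round only to non-integer tensors. For integer tensors
    the input is returned as-is, because rounding an integer is a no-op,
    so the generated kernel never calls nearbyint with an int argument.
    """
    if is_integer:
        # Skip the op entirely: the result would equal the input anyway.
        return tensor
    # Floating-point inputs go through the regular round op.
    return fd.ops.round(tensor)
```

In the reproduction this would read t2 = round_unless_integer(fd, t1, is_integer=True), which returns t1 itself. Whether nvfuser accepts a fusion whose output is an unmodified input has not been checked here, and this does not fix the underlying codegen problem.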

Versions

devel