Open maleadt opened 3 years ago
Alternatively, we could build compiler-rt for NVPTX and ship that in CUDA.jl like we do with libdevice.
Another MWE from https://github.com/JuliaGPU/CUDA.jl/issues/793:
julia> using CUDA

julia> A = zeros(3) |> CuArray
3-element CuArray{Float64, 1}:
 0.0
 0.0
 0.0

julia> A .= UInt128(5)
CUDA GPUs do not natively support Int128 operations. LLVM can lower code that works with Int128 (https://reviews.llvm.org/rGb9fc48da832654a2b486adaa790ceaa6dba94455), but the lowered code requires compiler intrinsics for many operations:
With https://reviews.llvm.org/D34708, it should be possible to resolve those intrinsics in the current module, so we can just add them to our runtime library.
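To make the "add them to our runtime library" idea concrete, here is a rough sketch of what one such routine computes. It mirrors the job of compiler-rt's `__floatuntidf` (unsigned 128-bit integer to double), which is likely the kind of conversion the `A .= UInt128(5)` MWE needs. The function name `floatuntidf_sketch` and the split-into-halves approach are assumptions for illustration only. This is not how compiler-rt or CUDA.jl implements it, and it is not correctly rounded for every input.

```julia
# Hypothetical sketch, not actual CUDA.jl or compiler-rt code.
# Converts a UInt128 to Float64 by working on its two 64-bit halves.
function floatuntidf_sketch(x::UInt128)
    hi = (x >> 64) % UInt64   # upper 64 bits
    lo = x % UInt64           # lower 64 bits
    # ldexp scales by 2^64 exactly. Float64(hi), Float64(lo), and the final
    # sum can each round, so large inputs may differ from Float64(x) in the
    # last bit.
    return ldexp(Float64(hi), 64) + Float64(lo)
end
```

For small values the result is exact, e.g. `floatuntidf_sketch(UInt128(5))` returns `5.0`. A real runtime function would also have to be registered under the symbol name LLVM emits so the call can be resolved in the current module. That wiring is not shown here.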