Open pxl-th opened 1 year ago
`atomicrmw fmax/fmin` was added in https://reviews.llvm.org/D127041, which I think came after LLVM 14, so we'll probably have to wait for LLVM 15 support in Julia to get this as a single GCN instruction.
In the meantime, we should add a fallback to a CAS loop in https://github.com/JuliaConcurrent/UnsafeAtomics.jl.
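To make the CAS-loop idea concrete, here is a minimal CPU-side sketch using `Base.Threads` rather than the GPU path or the UnsafeAtomics.jl API. The helper name `cas_max!` is hypothetical and exists only for illustration. It stores the `Float32` as its `UInt32` bit pattern and retries the compare-and-swap until no other thread has changed the value in between.

```julia
using Base.Threads: Atomic, atomic_cas!

# Hypothetical helper (not part of UnsafeAtomics.jl): atomic max on a Float32
# that is stored as its UInt32 bit pattern, implemented as a CAS loop.
function cas_max!(cell::Atomic{UInt32}, val::Float32)
    old_bits = cell[]                          # atomic load of the current bits
    while true
        old = reinterpret(Float32, old_bits)
        new_bits = reinterpret(UInt32, max(old, val))
        # atomic_cas! returns the value it found in `cell`; the swap happened
        # only if that value equals the one our `max` was computed from.
        seen = atomic_cas!(cell, old_bits, new_bits)
        seen == old_bits && return old         # success: return the previous value
        old_bits = seen                        # another thread won the race; retry
    end
end

# Usage: reduce a few values into one cell from multiple threads.
cell = Atomic{UInt32}(reinterpret(UInt32, -Inf32))
Threads.@threads for x in Float32[-3.5, 0.25, 7.0, 2.0, -1.0]
    cas_max!(cell, x)
end
result = reinterpret(Float32, cell[])
```

Comparing the raw bits rather than the floating-point values keeps the retry condition well defined even when a NaN is stored, since `NaN != NaN` would otherwise make the loop spin. A real fallback in UnsafeAtomics.jl would operate on device pointers instead of `Atomic` cells, so treat this only as an outline of the loop structure.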
For feature parity, it'd be good to support atomic `max` on `Float`. Currently the MWE below fails:

Error
```julia
ERROR: InvalidIRError: compiling kernel gpu_f(KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, AMDGPU.Device.ROCDeviceVector{Float32, 1}, AMDGPU.Device.ROCDeviceVector{Float32, 1}, AMDGPU.Device.ROCDeviceVector{Int64, 1}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to ijl_get_nth_field_checked)
Stacktrace:
 [1] indexed_iterate
   @ ./namedtuple.jl:140
 [2] multiple call sites
   @ unknown:0
Reason: unsupported call to an unknown function (call to jl_f_tuple)
Stacktrace:
 [1] indexed_iterate
   @ ./namedtuple.jl:140
 [2] multiple call sites
   @ unknown:0
Reason: unsupported call to an unknown function (call to ijl_get_nth_field_checked)
Stacktrace:
 [1] atomic_pointermodify
   @ ~/.julia/packages/LLVM/9gCXO/src/interop/atomics.jl:395
 [2] modify!
   @ ~/.julia/packages/UnsafeAtomicsLLVM/i4GMj/src/internal.jl:18
 [3] modify!
   @ ~/.julia/packages/Atomix/F9VIX/src/core.jl:33
 [4] macro expansion
   @ ~/code/Nerf.jl/src/Nerf.jl:155
 [5] gpu_f
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [6] gpu_f
   @ ./none:0
Reason: unsupported dynamic function invocation (call to atomic_pointerreplace)
Stacktrace:
 [1] atomic_pointermodify
   @ ~/.julia/packages/LLVM/9gCXO/src/interop/atomics.jl:395
 [2] modify!
   @ ~/.julia/packages/UnsafeAtomicsLLVM/i4GMj/src/internal.jl:18
 [3] modify!
   @ ~/.julia/packages/Atomix/F9VIX/src/core.jl:33
 [4] macro expansion
   @ ~/code/Nerf.jl/src/Nerf.jl:155
 [5] gpu_f
   @ ~/.julia/packages/KernelAbstractions/C8flJ/src/macros.jl:81
 [6] gpu_f
   @ ./none:0
```

As a temporary workaround, we can reinterpret `Float32` as `UInt32`: