tgross35 opened 5 months ago
@rustbot label +A-simd +T-libs +F-f16_and_f128 +E-help-wanted +C-feature-request -needs-triage
Nvidia PTX (`--target nvptx64-nvidia-cuda`) also supports arithmetic instructions for `f16` and `f16x2` SIMD.
Making this work is an important step toward making the PTX target "feature complete" relative to languages traditionally used for GPGPU. Let me know if there's anything I can do to support this.
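For reference, a minimal sketch of the kind of code this enables (nightly only; the function name and shape are illustrative, not from the discussion): a device-side function using the `f16` primitive that, when built with `--target nvptx64-nvidia-cuda`, the backend can lower to native half-precision PTX instructions.

```rust
// Illustrative sketch only: scalar f16 arithmetic in a no_std device function.
#![feature(f16)]
#![no_std]

#[no_mangle]
pub extern "C" fn fma_f16(a: f16, x: f16, y: f16) -> f16 {
    // With hardware f16 support, this can stay in half precision
    // (e.g. lowering to f16 mul/add or a fused form) instead of
    // being promoted to f32.
    a * x + y
}
```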
Thanks, I'll add that to the list at the top of the issue.
It looks like it might not be too hard to add new SIMD intrinsics on that platform? I have no clue, but https://github.com/rust-lang/stdarch/blob/df3618d9f35165f4bc548114e511c49c29e1fd9b/crates/core_arch/src/nvptx/mod.rs is pretty straightforward if you want to give it a shot at some point.
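For context, the existing intrinsics in that file are a thin public wrapper over an `extern` declaration bound to an NVVM intrinsic via `#[link_name]`, so a new one is roughly one more pair of those. A sketch of the shape (the `_syncthreads` pair mirrors the linked file; the f16x2 lines are only a placeholder, and the real NVVM intrinsic names would have to be looked up in the LLVM NVPTX backend):

```rust
// Shape of the existing nvptx intrinsics in core_arch/src/nvptx/mod.rs.
extern "C" {
    #[link_name = "llvm.nvvm.barrier0"]
    fn syncthreads();

    // A new f16x2 intrinsic would be declared the same way. The name below is
    // a placeholder, not a real NVVM intrinsic.
    // #[link_name = "llvm.nvvm.<some-f16x2-op>"]
    // fn some_f16x2_op(a: u32, b: u32) -> u32;
}

/// Synchronizes all threads in the block.
#[inline]
pub unsafe fn _syncthreads() {
    syncthreads()
}
```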
I just tested `f16` on nvptx now and I don't think I realized how many of the pieces were already put together. That's great!
I looked around a bit at SIMD instructions for other arches and I think this is, as you say, pretty straightforward. I will give it a shot. Hopefully I will get around to creating a PR next week.
That is great news! Note that unfortunately math symbols aren’t yet available on all targets so testing with the new types is kind of weird sometimes, but hopefully that will be resolved in a week or so with a compiler_builtins update.
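For what it's worth, the kind of smoke test that tends to work before full runtime support lands is plain arithmetic plus bit-level checks, avoiding libm-style math functions; a small sketch (nightly only):

```rust
// Tiny f16 smoke test: basic arithmetic and an exact bit-pattern check,
// no math functions like sqrt involved.
#![feature(f16)]

fn main() {
    let x: f16 = 1.5;
    let y: f16 = 2.0;
    assert!(x + y == 3.5);
    // 3.5 in IEEE binary16 is 0x4300, so the result can be checked exactly.
    assert!((x + y).to_bits() == 0x4300);
}
```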
Took me a bit longer than I originally hoped, but I ended up creating a PR for (most) nvptx `f16x2` intrinsics and getting it merged: https://github.com/rust-lang/stdarch/pull/1626
I have also noticed that we're lacking `portable_simd` variants of `f16`, and that's not being tracked by this issue. Is that outside the scope of this issue or just not added yet? Is anyone already coordinating with the portable_simd project, or is it simply blocked by other features that need to land first?
> Took me a bit longer than I originally hoped, but I ended up creating a PR for (most) nvptx `f16x2` intrinsics and getting it merged: rust-lang/stdarch#1626
Awesome news, thanks for the update! Looks like there is an open PR to pull the new changes into rust-lang/rust: https://github.com/rust-lang/rust/pull/128866.
> I have also noticed that we're lacking `portable_simd` variants of `f16`, and that's not being tracked by this issue. Is that outside the scope of this issue or just not added yet? Is anyone already coordinating with the portable_simd project, or is it simply blocked by other features that need to land first?
I'll add it to the issue; no particular reason other than it being lower priority than the intrinsics.
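For comparison, this is roughly what the existing nightly portable_simd API looks like with `f32`; the missing piece above would be the analogous `f16` lane support (an `f16x4`-style alias is an assumption by analogy, not an existing API):

```rust
// Existing portable_simd usage with f32 lanes (nightly only); hypothetical
// f16 variants would ideally mirror this shape.
#![feature(portable_simd)]
use std::simd::f32x4;

fn scale(values: [f32; 4], factor: f32) -> [f32; 4] {
    let v = f32x4::from_array(values);
    // Element-wise multiply across all four lanes.
    (v * f32x4::splat(factor)).to_array()
}

fn main() {
    assert_eq!(scale([1.0, 2.0, 3.0, 4.0], 2.0), [2.0, 4.0, 6.0, 8.0]);
}
```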
Eventually we will want to be able to make use of SIMD operations for `f16` and `f128`, now that we have primitives to represent them. Possibilities that I know of:

- `float16x{4,8}` https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&f:@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&q=
- `float16x{1,2}` https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiessimdisa=[sve2,sve]&q=

Probably some work/research overlap with adding assembly support: https://github.com/rust-lang/rust/issues/125398
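As a reference point for the `float16x{4,8}` item, the stable `f32` NEON intrinsics already have the shape that work would extend; a minimal sketch (the helper name is illustrative):

```rust
// Existing stable f32 NEON intrinsics on aarch64; float16x{4,8} support would
// add the analogous half-precision types and operations.
#[cfg(target_arch = "aarch64")]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use core::arch::aarch64::{vaddq_f32, vld1q_f32, vst1q_f32};
    let mut out = [0.0f32; 4];
    // SAFETY: NEON is a baseline feature on aarch64, and each pointer is valid
    // for four f32 lanes.
    unsafe {
        let va = vld1q_f32(a.as_ptr());
        let vb = vld1q_f32(b.as_ptr());
        vst1q_f32(out.as_mut_ptr(), vaddq_f32(va, vb));
    }
    out
}
```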
Tracking issue: https://github.com/rust-lang/rust/issues/116909