NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

[EPIC]: Reproducible floating-point reductions #1558

Open. jrhemstad opened this issue 7 months ago

jrhemstad commented 7 months ago

Area

CUB

Is your feature request related to a problem? Please describe.

I would like reproducible reductions for floating-point values.
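For context, here is a minimal host-side sketch of why this matters. It is not CUB code and not the proposed algorithm. Floating-point addition is not associative, so regrouping the same additions (as a parallel reduction does when its partitioning changes) can change the result. The function names `sum_left_to_right` and `sum_two_halves` are hypothetical and exist only for this illustration.

```cpp
#include <numeric>
#include <vector>

// Sums the values strictly left to right in single precision.
float sum_left_to_right(const std::vector<float>& values) {
    return std::accumulate(values.begin(), values.end(), 0.0f);
}

// Splits the input into two halves, sums each half separately, then adds
// the two partial sums. This mimics how a parallel reduction regroups the
// additions; it is an illustration only, not how CUB partitions work.
float sum_two_halves(const std::vector<float>& values) {
    const auto mid = values.begin() + values.size() / 2;
    const float left  = std::accumulate(values.begin(), mid, 0.0f);
    const float right = std::accumulate(mid, values.end(), 0.0f);
    return left + right;
}
```

With the input `{1e8f, 1.0f, -1e8f, 1.0f}` under default IEEE-754 single-precision arithmetic (no fast-math), the left-to-right sum is `1.0f` and the two-halves sum is `0.0f`, while the exact sum is 2. Same data, different grouping, different answer. A reproducible reduction would return the same bits regardless of grouping.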

Describe the solution you'd like

@maddyscientist has a proof-of-concept implementation here: https://github.com/maddyscientist/reproducible_floating_sums/tree/feature/cuda

MVP:

Future work:

Describe alternatives you've considered

No response

Additional context

No response

### Tasks
- [ ] https://github.com/NVIDIA/cccl/issues/2119
- [ ] https://github.com/NVIDIA/cccl/issues/2112
- [ ] https://github.com/NVIDIA/cccl/issues/2120
- [ ] https://github.com/NVIDIA/cccl/issues/2121
- [ ] https://github.com/NVIDIA/cccl/issues/2122
- [ ] https://github.com/NVIDIA/cccl/issues/2125
- [ ] https://github.com/NVIDIA/cccl/issues/2124
- [ ] https://github.com/NVIDIA/cccl/issues/2123
- [ ] https://github.com/NVIDIA/cccl/issues/2126

jrhemstad commented 7 months ago

The algorithm used in the proof-of-concept is discussed in @maddyscientist's GTC talk (session S62405): https://www.nvidia.com/gtc/session-catalog/?search=S62405&tab.allsessions=1700692987788001F1cG#/session/1696035956217001KKhw