minnervva / torchdetscan

This is a tool for finding non-deterministic functions in your pytorch code.
https://github.com/minnervva/torchdetscan
MIT License

Summarize discussion about reduction algorithm, determinism, etc. #20

Open mtaillefumier opened 7 months ago

mtaillefumier commented 7 months ago

We had a long discussion on the 29th about the reduction algorithm, which extended to the two versions of determinism: the strong vs. weak implementation.

Refs:
- https://groups.csail.mit.edu/commit/papers/09/asplos073-olszewski.pdf
- https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.7763
- https://www.physicsforums.com/threads/parallel-kahan-summation.716446/ (Kahan sum in parallel)
- https://indico.fnal.gov/event/57249/contributions/270632/attachments/169825/228082/reproduciblity_lattice2023.pdf (Kate's talk)
- https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf (Mark Harris's talk about reduction, performance oriented)

We must distinguish between reproducibility and error minimization.
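
To make the distinction concrete, here is a minimal Python sketch. It is not code from torchdetscan; the helper names `naive_sum` and `kahan_sum` and the test data are assumptions chosen for illustration. A fixed-order naive sum is reproducible (same bits on every run) but can be inaccurate, and changing the summation order changes its result. Kahan summation reduces the error, but that alone does not guarantee bitwise-identical results if the reduction order varies.

```python
import math


def naive_sum(values):
    """Left-to-right accumulation in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan compensated summation: carries the rounding error of each add."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


# Hypothetical data: one large term followed by many tiny ones.
data = [1.0] + [1e-16] * 10_000
exact = math.fsum(data)  # correctly rounded reference value

# Reproducibility: same order in, same bits out, even if the answer is poor.
print("naive, same order, identical:", naive_sum(data) == naive_sum(data))
print("naive error:", abs(naive_sum(data) - exact))

# Floating-point addition is not associative, so a different order
# (as a parallel reduction might produce) gives a different naive result.
print("naive, reversed order, identical:", naive_sum(data) == naive_sum(data[::-1]))

# Error minimization: compensated summation gets much closer to the reference.
print("kahan error:", abs(kahan_sum(data) - exact))
```

In other words, fixing the reduction order addresses reproducibility, while compensated summation addresses accuracy; the two can be combined, but neither implies the other.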