A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License
8.17k stars · 1.35k forks
Increase tolerance to workaround unit test failures on A100 #1766
Closed
nWEIdia closed 6 months ago
Failures occur with an absolute difference of ~0.001190185546875 and a relative difference of ~0.0306854248046875.
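
For context, here is a minimal sketch of how a combined absolute/relative tolerance check treats a mismatch of this size. It assumes the common closeness rule `|actual - expected| <= atol + rtol * |expected|` (the rule `torch.testing.assert_close` documents). The helper `within_tolerance`, the reconstructed `expected` value, and the tolerance settings are hypothetical illustrations, not the values used in the affected tests.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Closeness rule: |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Differences reported in this PR.
ABS_DIFF = 0.001190185546875
REL_DIFF = 0.0306854248046875

# Assumption: relative diff = absolute diff / |expected|, so the magnitude of
# the reference value can be backed out from the two reported numbers.
expected = ABS_DIFF / REL_DIFF
actual = expected + ABS_DIFF

# Hypothetical tolerance settings, for illustration only.
strict = within_tolerance(actual, expected, rtol=1e-3, atol=1e-3)
relaxed = within_tolerance(actual, expected, rtol=1e-3, atol=2e-3)

print(f"implied |expected|: {expected:.6f}")
print(f"passes strict check:  {strict}")
print(f"passes relaxed check: {relaxed}")
```

Because the implied reference value is small, the `rtol` term contributes very little to the allowed error, so in this sketch it is `atol` that decides whether the comparison passes.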