NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License
8.17k stars 1.35k forks

Increase tolerance to work around unit test failures on A100 #1766

nWEIdia closed 6 months ago

nWEIdia commented 6 months ago

The failures occur with an absolute difference of ~0.001190185546875 and a relative difference of ~0.0306854248046875.
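
To illustrate why a mismatch of this size can fail one tolerance setting and pass a looser one, here is a minimal sketch in plain Python. It uses the combined check `|actual - expected| <= atol + rtol * |expected|`, which is the form used by `torch.testing.assert_close` and `numpy.allclose`. The thread does not say which comparison function or which tolerance values the affected tests use, so the helper `within_tolerance` and both `rtol`/`atol` pairs below are hypothetical; only the two difference values come from the report.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Hypothetical helper: True if |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Differences quoted in the report above.
ABS_DIFF = 0.001190185546875
REL_DIFF = 0.0306854248046875

# Reference magnitude implied by the two numbers (abs diff / rel diff).
expected = ABS_DIFF / REL_DIFF
# A value that is off from the reference by the reported absolute difference.
actual = expected + ABS_DIFF

# Assumed tolerances, chosen only for illustration; not taken from apex's tests.
strict = within_tolerance(actual, expected, rtol=1e-3, atol=1e-3)
relaxed = within_tolerance(actual, expected, rtol=5e-2, atol=2e-3)

print(f"implied reference magnitude: {expected:.6f}")
print(f"passes strict tolerances:  {strict}")
print(f"passes relaxed tolerances: {relaxed}")
```

With the assumed strict pair the allowed error is just over 0.001, so the reported difference fails; the relaxed pair allows a larger error, so the same difference passes. Whether loosening tolerances is the right fix, as opposed to investigating the numerical difference on A100, is a separate question that this sketch does not address.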