NVIDIA / apex

A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
BSD 3-Clause "New" or "Revised" License

Increase tolerance to work around unit test failures on A100 #1766

Closed: nWEIdia closed this 10 months ago

nWEIdia commented 10 months ago

The failures occur with an absolute difference of ~0.001190185546875 and a relative difference of ~0.0306854248046875.
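To make the reported numbers concrete, here is a minimal pure-Python sketch of the usual closeness rule `|actual - expected| <= atol + rtol * |expected|`. The helper `within_tolerance` and the `rtol`/`atol` values are hypothetical and chosen only for illustration; they are not the tolerances used in the apex test suite. The sketch also assumes the relative difference was computed as the absolute difference divided by the magnitude of the reference value.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Return True when |actual - expected| <= atol + rtol * |expected|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Differences reported in the issue.
ABS_DIFF = 0.001190185546875
REL_DIFF = 0.0306854248046875

# Assumption: rel_diff = abs_diff / |expected|, so the reference magnitude
# can be recovered from the two reported numbers.
expected = ABS_DIFF / REL_DIFF
actual = expected + ABS_DIFF

# Hypothetical tolerances, for illustration only.
tight = within_tolerance(actual, expected, rtol=1e-3, atol=1e-5)
loose = within_tolerance(actual, expected, rtol=5e-2, atol=1e-5)

print(f"tight tolerance passes: {tight}")
print(f"loose tolerance passes: {loose}")
```

Under these assumed values, a relative difference of about 3% fails the tighter check and passes only once `rtol` is raised above that level, which is the kind of adjustment the issue title describes.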