Hey,
thanks for your report. My first guess for those discrepancies would be floating-point precision. Could you provide the relative mismatch between the differing values by adding something like
```python
import torch

# Print every element pair that differs beyond the default tolerances.
for lib_out, torch_out in zip(lib_outputs.flatten(), torch_outputs.flatten()):
    if not torch.allclose(lib_out, torch_out):
        print(lib_out, torch_out)
```
to your example?
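If it is easier, you can also get the maximum relative error in one shot; a sketch, assuming the same `lib_outputs` / `torch_outputs` tensors as in your script (the clamp only guards against division by zero):

```python
rel_err = (lib_outputs - torch_outputs).abs() / torch_outputs.abs().clamp_min(1e-12)
print(rel_err.max())
```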
It looks like your guess is right. Here's part of the output.
```
...
tensor(-0.7282) tensor(-0.7282)
tensor(0.5448) tensor(0.5448)
tensor(-0.2942) tensor(-0.2942)
tensor(0.0671) tensor(0.0671)
tensor(0.0840) tensor(0.0840)
tensor(-1.2255) tensor(-1.2254)
tensor(0.5453) tensor(0.5453)
tensor(0.1272) tensor(0.1272)
tensor(0.0616) tensor(0.0616)
tensor(0.0478) tensor(0.0478)
tensor(-0.5128) tensor(-0.5128)
tensor(0.0550) tensor(0.0550)
tensor(-0.8851) tensor(-0.8851)
tensor(0.5813) tensor(0.5813)
tensor(-0.1910) tensor(-0.1910)
tensor(-0.4936) tensor(-0.4936)
tensor(-0.0498) tensor(-0.0498)
...
```
Most pairs look identical only because of the default print precision. Still, it is strange that floating-point precision alone could lead to this much of a difference (max difference 0.0001).
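For completeness, raising the print precision makes the gap visible; a sketch on the same tensors (10 digits is an arbitrary choice):

```python
import torch

torch.set_printoptions(precision=10)
print((lib_outputs - torch_outputs).abs().max())
```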
I have already seen such behavior multiple times. One fix is to increase the tolerances in `torch.allclose`, e.g. using `rtol=5e-05, atol=1e-07` instead of the defaults (`rtol=1e-05, atol=1e-08`).
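Applied to the tensors from your script, that would be something like:

```python
print(torch.allclose(lib_outputs, torch_outputs, rtol=5e-05, atol=1e-07))
```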
Thanks for sharing this library! I tried to extend the unfold-based version of conv2d to conv3d, but the output difference seems non-negligible, albeit small. Any clues why this happens?
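Roughly, the unfold-based conv3d I mean looks like this; a minimal sketch with illustrative names (`conv3d_unfold` is my own helper, not part of the library), no padding and no bias:

```python
import torch
import torch.nn.functional as F

def conv3d_unfold(x, weight, stride=1):
    # x: (N, C_in, D, H, W); weight: (C_out, C_in, kD, kH, kW).
    kD, kH, kW = weight.shape[2:]
    # F.unfold only accepts 4D input, so extract sliding windows manually,
    # one spatial dimension at a time, with Tensor.unfold.
    patches = (x.unfold(2, kD, stride)
                .unfold(3, kH, stride)
                .unfold(4, kW, stride))  # (N, C_in, oD, oH, oW, kD, kH, kW)
    # Contract over the input channels and the kernel window.
    return torch.einsum('ncdhwijk,ocijk->nodhw', patches, weight)

x = torch.randn(2, 3, 8, 8, 8)
w = torch.randn(4, 3, 3, 3, 3)
# The summation order differs from the native kernel, so exact equality
# is not guaranteed; small float32 discrepancies like the ones below appear.
print((conv3d_unfold(x, w) - F.conv3d(x, w)).abs().max())
```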
Outputs: