szagoruyko opened this issue 8 years ago
I would argue that if you want absolutely deterministic results, use `cudnn.convert(model, nn)`. NVIDIA cuDNN is not deterministic, so I think it's better not to give "false expectations" to the users of cudnn.torch.
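For context, a minimal sketch of that workaround; `cudnn.convert` is the real cudnn.torch API, the model itself is illustrative:

```lua
require 'cudnn'

-- illustrative model built from cuDNN-backed modules
local model = nn.Sequential()
   :add(cudnn.SpatialConvolution(3, 16, 3, 3, 1, 1, 1, 1))
   :add(cudnn.ReLU(true))
   :add(cudnn.SpatialMaxPooling(3, 3, 2, 2))
   :cuda()

-- convert every cudnn.* module back to its nn.* (THNN) equivalent,
-- trading speed for fully deterministic forward/backward passes
cudnn.convert(model, nn)
print(model)
```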
@jpuigcerver cuDNN has deterministic algorithms; see the manual.
Copy & paste from the cuDNN 5.1 user guide:
However, bit-wise reproducibility is not guaranteed across versions, as the implementation of a given routine may change. With the current release, the following routines do not guarantee reproducibility because they use atomic operations:
‣ cudnnConvolutionBackwardFilter when CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0 or CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3 is used
‣ cudnnConvolutionBackwardData when CUDNN_CONVOLUTION_BWD_DATA_ALGO_0 is used
‣ cudnnPoolingBackward when CUDNN_POOLING_MAX is used
‣ cudnnSpatialTfSamplerBackward
So, for instance, training with `cudnn.SpatialMaxPooling()` while setting `cudnn.deterministic = true` is impossible.
@jpuigcerver that's why I mentioned that we'd have to fall back to THNN or assert in MaxPooling. I think determinism within a single cuDNN version is already a guarantee worth offering.
@jpuigcerver Just to be clear, MaxPooling is nondeterministic only when the pooling stride is smaller than the pooling window, i.e. when the windows overlap.
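A hypothetical sketch of what that fallback could look like, combining both points; the `MaxPooling` factory and the `cudnn.deterministic` flag are illustrative, not existing cudnn.torch API:

```lua
-- hypothetical helper: pick the deterministic implementation when needed
local function MaxPooling(kW, kH, dW, dH, padW, padH)
   if cudnn.deterministic and (dW < kW or dH < kH) then
      -- overlapping windows: cuDNN's backward pass uses atomics here,
      -- so fall back to the deterministic THNN implementation
      return nn.SpatialMaxPooling(kW, kH, dW, dH, padW, padH):cuda()
   end
   -- non-overlapping pooling is deterministic even in cuDNN
   return cudnn.SpatialMaxPooling(kW, kH, dW, dH, padW, padH)
end
```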
I see that it's sometimes unclear how to enable deterministic mode. Instead of `setMode`, should we maybe have a global boolean `cudnn.deterministic`, like `cudnn.benchmark` or `cudnn.fastest`? It would work with the functional interface too. In the case of MaxPooling we'd have to fall back to THNN; as far as I remember, cuDNN max pooling is not deterministic.
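For comparison, here is roughly what the two options look like; the per-layer `setMode` call follows cudnn.torch's existing convolution API with algorithm names from the cuDNN manual, while `cudnn.deterministic` is the proposed flag, not something that exists yet:

```lua
-- today: determinism must be requested per layer, per algorithm
local conv = cudnn.SpatialConvolution(16, 32, 3, 3)
conv:setMode('CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM',
             'CUDNN_CONVOLUTION_BWD_DATA_ALGO_1',    -- no atomics
             'CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1')  -- no atomics

-- proposed: one global switch, mirroring the existing globals
cudnn.benchmark = false
cudnn.fastest = false
cudnn.deterministic = true  -- hypothetical flag discussed in this issue
```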