soumith / cudnn.torch

Torch-7 FFI bindings for NVIDIA CuDNN
BSD 2-Clause "Simplified" License

cudnn.deterministic mode #270

Open szagoruyko opened 8 years ago

szagoruyko commented 8 years ago

I see that it's sometimes not clear how to enable deterministic mode. Instead of setMode, should we maybe have a global boolean cudnn.deterministic, like cudnn.benchmark or cudnn.fastest? It would work with functional too. In the case of MaxPooling we'd have to fall back to THNN; as far as I remember, cudnn max pooling is not deterministic.
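A rough sketch of how the proposed flag might be used, mirroring the existing cudnn.benchmark and cudnn.fastest globals (cudnn.deterministic does not exist yet; it is the flag being proposed here):

```lua
require 'cudnn'

-- existing globals
cudnn.benchmark = false     -- benchmarking may select nondeterministic algorithms
cudnn.fastest   = false

-- proposed (hypothetical): restrict algorithm selection to deterministic ones
cudnn.deterministic = true

local model = nn.Sequential()
model:add(cudnn.SpatialConvolution(3, 64, 3, 3, 1, 1, 1, 1))
model:cuda()
```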

jpuigcerver commented 8 years ago

I would argue that if you want absolutely deterministic results, use cudnn.convert(model, nn). NVIDIA cuDNN is not deterministic, so I think it's better not to give "false expectations" to the users of cudnn.torch.
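For reference, converting back to nn for reproducible results looks like this (cudnn.convert replaces cudnn layers with their nn equivalents in place):

```lua
require 'cudnn'

local model = nn.Sequential()
model:add(cudnn.SpatialConvolution(3, 16, 5, 5))
model:add(cudnn.SpatialMaxPooling(3, 3, 2, 2))

-- swap cudnn modules for nn ones; the THNN kernels are deterministic
cudnn.convert(model, nn)
```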

szagoruyko commented 8 years ago

@jpuigcerver cudnn has deterministic algorithms, see the manual

jpuigcerver commented 8 years ago

Copy & paste from the cuDNN 5.1 user guide:

However, bit-wise reproducibility is not guaranteed across versions, as the implementation of a given routine may change. With the current release, the following routines do not guarantee reproducibility because they use atomic operations:
‣ cudnnConvolutionBackwardFilter when CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0 or CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3 is used
‣ cudnnConvolutionBackwardData when CUDNN_CONVOLUTION_BWD_DATA_ALGO_0 is used
‣ cudnnPoolingBackward when CUDNN_POOLING_MAX is used
‣ cudnnSpatialTfSamplerBackward

So, for instance, training with cudnn.SpatialMaxPooling() while honoring cudnn.deterministic = true would be impossible: its backward pass uses atomics.
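For convolutions, though, the current per-layer setMode API can already pin algorithms that avoid the atomic-based routines quoted above. A sketch, assuming cuDNN 5.x algorithm names:

```lua
local conv = cudnn.SpatialConvolution(3, 64, 3, 3, 1, 1, 1, 1)

-- ALGO_1 for both backward passes avoids the nondeterministic
-- BWD_FILTER_ALGO_0/3 and BWD_DATA_ALGO_0 listed in the user guide
conv:setMode('CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM',
             'CUDNN_CONVOLUTION_BWD_DATA_ALGO_1',
             'CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1')
```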

szagoruyko commented 8 years ago

@jpuigcerver that's why I mentioned that we'd have to fall back to THNN or assert in MaxPooling. I think determinism within a single cuDNN version is already worth having.

ngimel commented 8 years ago

@jpuigcerver Just to be clear, MaxPooling is nondeterministic only when the pooling stride is less than the pooling window, i.e. when pooling windows overlap.
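Concretely, the distinction is just the relation between window and stride arguments (kW, kH, dW, dH):

```lua
-- overlapping pooling (stride 2 < window 3): two windows can claim the same
-- input cell, so the backward pass accumulates gradients with atomics
local overlapping    = cudnn.SpatialMaxPooling(3, 3, 2, 2)

-- non-overlapping pooling (stride 2 >= window 2): each input cell belongs to
-- exactly one window, so the backward pass is deterministic
local nonoverlapping = cudnn.SpatialMaxPooling(2, 2, 2, 2)
```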