mys007 opened 9 years ago
This is true. I think I can relax this constraint more.
I will fix it when I get time. In the meantime, pull requests are welcome :)
From NVIDIA:
3D support has been added for all layers in cuDNN v3 RC.
The story with non-contiguous tensors is somewhat complicated. The short answer is: cuDNN will return CUDNN_STATUS_NOT_SUPPORTED if you attempt to call some routine with a tensor format that it does not support.
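For concreteness, here is a minimal sketch of what handling that status can look like, using the cuDNN v3-era `cudnnActivationForward` signature; the helper name and the fallback strategy are assumptions for illustration, not code from this repo:

```c
#include <cudnn.h>

/* Hypothetical helper: attempt ReLU forward on possibly strided tensors
 * and report whether cuDNN accepted the layout (cuDNN v3-era signature). */
static int try_relu_forward(cudnnHandle_t handle,
                            cudnnTensorDescriptor_t srcDesc, const void *src,
                            cudnnTensorDescriptor_t dstDesc, void *dst)
{
    const float alpha = 1.0f, beta = 0.0f;
    cudnnStatus_t st = cudnnActivationForward(handle, CUDNN_ACTIVATION_RELU,
                                              &alpha, srcDesc, src,
                                              &beta, dstDesc, dst);
    if (st == CUDNN_STATUS_NOT_SUPPORTED) {
        /* The routine rejected this tensor format; the caller should fall
         * back, e.g. copy into a contiguous NCHW buffer and retry. */
        return 0;
    }
    return st == CUDNN_STATUS_SUCCESS;
}
```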
The support matrix for padding/transposition is as follows:
|             | SUPPORT          | OPTIMIZED        |
| ----------- | ---------------- | ---------------- |
| **FORWARD** |                  |                  |
| Algo0       | all              | NCHW, W-packed   |
| Algo1       | all              | NCHW, W-packed   |
| Algo2       | all              | ?                |
| FFT         | NCHW, HW-packed  | NCHW, HW-packed  |
| **WGRAD**   |                  |                  |
| Algo0       | NCHW, CHW-packed |                  |
| Algo1       | NCHW, CHW-packed |                  |
| FFT         | NCHW, HW-packed  |                  |
| **DGRAD**   |                  |                  |
| Algo0       | NCHW, CHW-packed |                  |
| Algo1       | NCHW, CHW-packed |                  |
| FFT         | NCHW, HW-packed  |                  |
Meaning that to get the best performance on GEMM-based forward propagation you want a contiguous NCHW tensor. Transposition/padding is supported, but performance is not guaranteed. FFT for both forward and backprop supports padding in the C and N dimensions, but no transpositions. CHW-packed means that you cannot have transpositions or padding in the C, H, W dimensions, but can have padding in the N (outermost) dimension. Non-convolutional operators should support any strides for input and output; please file a bug if they do not.
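To make the packing terms concrete, here is a rough sketch (sizes invented for illustration) of building a CHW-packed descriptor with extra padding in N via `cudnnSetTensor4dDescriptorEx`:

```c
#include <cudnn.h>

/* Illustration of the packing terms above: W, H, and C are dense, with
 * optional padding only in the N stride. All sizes are made up. */
void make_chw_packed_desc(cudnnTensorDescriptor_t desc)
{
    int n = 16, c = 64, h = 32, w = 32;
    int wStride = 1;                 /* W-packed: innermost dim is dense   */
    int hStride = w * wStride;       /* HW-packed: rows are dense too      */
    int cStride = h * hStride;       /* CHW-packed: channels dense as well */
    int nStride = c * cStride + 128; /* padding in N is still permitted    */

    cudnnSetTensor4dDescriptorEx(desc, CUDNN_DATA_FLOAT, n, c, h, w,
                                 nStride, cStride, hStride, wStride);
}
```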
The requirements for contiguous tensors seem to be too strict with R3. Actually, cuDNN supports operations on non-contiguous data well; there are just some constraints, typically that input and gradInput, and output and gradOutput, need to have the same strides. Thus, I was able to extend Pointwise (ReLU) to work with non-contiguous tensors. Basically, one just needs to create personalized descriptors for every tensor and not share them.
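As a sketch of that approach (the shapes, strides, and function name are invented, not the actual cudnn.torch code), giving each tensor its own descriptor with explicit strides might look like:

```c
#include <cudnn.h>

/* Per-tensor descriptors: each tensor carries its own strides instead of
 * sharing one descriptor. Shapes and strides below are illustrative only. */
void make_pointwise_descs(cudnnTensorDescriptor_t inDesc,
                          cudnnTensorDescriptor_t outDesc)
{
    /* Non-contiguous input: a 32-wide view into rows of physical width 64. */
    cudnnSetTensor4dDescriptorEx(inDesc, CUDNN_DATA_FLOAT,
                                 1, 3, 32, 32,        /* n, c, h, w */
                                 3 * 32 * 64, 32 * 64, 64, 1);
    /* Contiguous output with its own strides. */
    cudnnSetTensor4dDescriptorEx(outDesc, CUDNN_DATA_FLOAT,
                                 1, 3, 32, 32,
                                 3 * 32 * 32, 32 * 32, 32, 1);
    /* gradInput would reuse inDesc's strides and gradOutput outDesc's,
     * satisfying the same-strides constraint mentioned above. */
}
```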