frankmanbb opened this issue 8 years ago
Intel is still working on optimizing some of the backward-pass kernels. These kernels come from our speech recognition pipeline, and Intel hasn't optimized them yet.
I'll ping someone from Intel to provide more details.
Thank you for your reply. I hope we can get that soon.
I chatted with Intel about this. These optimizations will be available as part of the new MKL 2017 release, which will be out in a few weeks.
According to my benchmark results, MKL's direct (non-GEMM) deep-learning convolution has a much slower backward pass than forward pass.
For example, with W=341, H=79, C=32, N=4, K=32, R=5, S=10 on a KNL 7250 platform: forward is 0.91 ms, backward with respect to the input is 68.79 ms, and backward with respect to the weights is 74.98 ms, so the backward pass is roughly 68x slower than the forward pass.
As a comparison, on a TITAN X the forward pass is 0.74 ms, backward with respect to the input is 3.09 ms, and backward with respect to the weights is 0.76 ms. For the forward pass, the KNL 7250 is only a little slower than the TITAN X, but for the backward pass it is far slower. The same pattern holds for other W, H, C configurations.
Can anyone explain why? Is it because MKL has not yet put much optimization into the backward pass?
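For reference, here is a minimal sketch of the kind of forward-vs-backward timing comparison described above. It is not the original benchmark: it uses PyTorch's CPU path (which dispatches to oneDNN/MKL-DNN on Intel hardware) as a stand-in framework, and the `time_ms` helper and iteration count are my own assumptions. It times the forward pass alone and the combined forward+backward pass; the difference approximates the backward cost.

```python
# Sketch only: timing forward vs. forward+backward for the convolution shape
# quoted above (N=4, C=32, H=79, W=341, K=32, R=5, S=10), assuming PyTorch as
# the framework. The original report's framework and measurement tool differ.
import time
import torch
import torch.nn as nn

N, C, H, W = 4, 32, 79, 341      # batch, input channels, input height, input width
K, R, S = 32, 5, 10              # output channels, filter height, filter width

conv = nn.Conv2d(in_channels=C, out_channels=K, kernel_size=(R, S), bias=False)
x = torch.randn(N, C, H, W, requires_grad=True)

def time_ms(fn, iters=20):
    # Warm up once, then average wall-clock time (in ms) over `iters` runs.
    fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

y = conv(x)                       # one forward pass to get the output shape
grad_out = torch.randn_like(y)    # upstream gradient for the backward pass

fwd_ms = time_ms(lambda: conv(x))
fwd_bwd_ms = time_ms(lambda: conv(x).backward(grad_out))

print(f"forward:            {fwd_ms:.2f} ms")
print(f"forward + backward: {fwd_bwd_ms:.2f} ms")
print(f"approx. backward:   {fwd_bwd_ms - fwd_ms:.2f} ms")
```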