baidu-research / DeepBench

Benchmarking Deep Learning operations on different hardware
Apache License 2.0

Why is MKL's backward pass much slower than the forward pass? #2

Open frankmanbb opened 8 years ago

frankmanbb commented 8 years ago

According to the benchmark results, MKL's convolution kernels (not GEMM) have a much slower backward pass than forward pass.

For example, with W=341, H=79, C=32, N=4, K=32, R=5, S=10 on the KNL 7250 platform: forward takes 0.91 ms, backward with respect to input takes 68.79 ms, and backward with respect to weights takes 74.98 ms. So the input-gradient pass alone is roughly 75 times slower than the forward pass.

As a comparison, on a Titan X the forward pass is 0.74 ms, backward with respect to input is 3.09 ms, and backward with respect to weights is 0.76 ms. For the forward pass, the KNL 7250 is only slightly slower than the Titan X, but for the backward pass it is far slower. The pattern is similar for other W, H, C configurations.
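For reference, here is a small sketch that computes the backward/forward slowdown factors directly from the timings quoted above (the numbers are the reported benchmark results; the dictionary layout is just for illustration):

```python
# Slowdown factors computed from the timings (ms) quoted in this thread
# for the conv layer W=341, H=79, C=32, N=4, K=32, R=5, S=10.
timings_ms = {
    "KNL7250": {"forward": 0.91, "bwd_input": 68.79, "bwd_weight": 74.98},
    "TitanX":  {"forward": 0.74, "bwd_input": 3.09,  "bwd_weight": 0.76},
}

for platform, t in timings_ms.items():
    # Ratio of each backward pass to the forward pass on the same hardware.
    print(f"{platform}: bwd_input {t['bwd_input'] / t['forward']:.1f}x forward, "
          f"bwd_weight {t['bwd_weight'] / t['forward']:.1f}x forward")
```

This gives roughly 75x (input) and 82x (weights) on KNL 7250, versus about 4x and 1x on the Titan X, which is why the asymmetry stands out so sharply on the KNL.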

Can anyone explain why? Is it because MKL has not yet optimized the backward pass?

sharannarang commented 8 years ago

Intel is still working on optimizing some kernels for the backward pass. These kernels come from our speech recognition pipeline, and they haven't been optimized yet.

I'll ping someone from Intel to provide more details.

frankmanbb commented 8 years ago

Thank you for your reply. I hope the optimizations arrive soon.

sharannarang commented 8 years ago

I chatted with Intel about this. These optimizations will ship as part of the upcoming MKL 2017 release, which should be out in a few weeks.