artbataev opened this issue 6 years ago
We investigated this, and Spandan and Bowen found that the calculation of the new convolve geometry is the culprit for the slowdown. If the input size is fixed, CNTK only calculates this once, but with variable input sizes it has to recalculate the geometry on every forward call.
Thanks for the reply. Is it possible to speed up these calculations? This seems like a significant and unexpected drawback: the network is rather large, and the geometry calculation should not be expensive compared to the convolution computation itself.
The slowdown comes from the division and modulo operations performed when calculating the geometry, so avoiding variable input sizes may be the best option for now.
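One way to keep sizes fixed is to zero-pad (or truncate) every input to a common length before feeding it to the network, so the convolution geometry only has to be computed once. A minimal sketch, assuming 40-dimensional features and a hypothetical `pad_to_fixed_length` helper (neither is from the original issue):

```python
import numpy as np

FEAT_DIM = 40     # assumed feature dimension
FIXED_LEN = 600   # assumed common target length

def pad_to_fixed_length(features, fixed_len=FIXED_LEN):
    """Zero-pad or truncate a (length, FEAT_DIM) array to (fixed_len, FEAT_DIM),
    so every batch has the same shape and the cached geometry is reused."""
    length = min(features.shape[0], fixed_len)
    padded = np.zeros((fixed_len, features.shape[1]), dtype=np.float32)
    padded[:length] = features[:length]
    return padded
```

Padding wastes some computation on the padded frames, so bucketing inputs into a small number of fixed lengths (and reusing one compiled model per bucket) is a common compromise.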
When using variable-length input with a CNN, performance decreases significantly compared to fixed-length input. Why?
To reproduce the result I created a simple network. I use a single CPU core (CPU usage is limited with taskset), but practically the same result is observed on GPU.
Using variable-length input is 62% slower (1.863 sec/batch) than using fixed-length input (1.153 sec/batch).
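For reference, a minimal sketch of this kind of comparison (not the original snippets: the feature dimension, lengths, layer sizes, and the use of `C.FreeDimension` for the variable-size input are all assumptions):

```python
import time
import numpy as np
import cntk as C

FEAT_DIM = 40     # assumed feature dimension
FIXED_LEN = 500   # assumed fixed input length

def make_model(x):
    # Assumed small convolutional stack, used only for timing.
    model = C.layers.Sequential([
        C.layers.Convolution2D((3, 3), 32, pad=True, activation=C.relu),
        C.layers.Convolution2D((3, 3), 32, pad=True, activation=C.relu),
    ])
    return model(x)

def bench(z, x, lengths, repeats=20):
    start = time.time()
    for i in range(repeats):
        length = lengths[i % len(lengths)]
        data = np.random.rand(1, 1, length, FEAT_DIM).astype(np.float32)
        z.eval({x: data})  # forward pass only
    return (time.time() - start) / repeats

# Fixed-size input: the convolution geometry is computed once and reused.
x_fixed = C.input_variable((1, FIXED_LEN, FEAT_DIM))
z_fixed = make_model(x_fixed)

# Variable-size input via a free static axis: the geometry is recomputed
# on every forward call, which is where the slowdown comes from.
x_var = C.input_variable((1, C.FreeDimension, FEAT_DIM))
z_var = make_model(x_var)

print('fixed   : %.3f sec/batch' % bench(z_fixed, x_fixed, [FIXED_LEN]))
print('variable: %.3f sec/batch' % bench(z_var, x_var, [400, 450, 500, 550]))
```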
System information: