Open RicardoDominguez opened 6 years ago
For Theano: (source)
Better on GPU: matrix multiplication, convolution, and large element-wise operations can be accelerated a lot (5-50x) when arguments are large enough to keep 30 processors busy.
Equal on GPU/CPU: indexing, dimension-shuffling and constant-time reshaping.
Better on CPU: summation over rows/columns of tensors.
Copying of large quantities of data to and from a device is relatively slow, and often cancels most of the advantage of one or two accelerated functions on that data. Getting GPU performance largely hinges on making data transfer to the device pay off. By default all inputs will get transferred to GPU. You can prevent an input from getting transferred by setting its tag.target attribute to ‘cpu’.
Tips for Improving Performance on GPU
Make available optional use of GPU acceleration.