device convolution experiments

This page explains the experiments we plan to finish.

Test file

https://github.com/fengggli/gpu-computing-materials/blob/e32df713f60dd1e195d7413142d76e29086ca0dc/tests/bench_conv_device.cpp
We need to move the allocation of cuDNN device memory outside of the forward/backward calls; @fengggli will do that later.
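The planned change can be sketched roughly as follows: allocate the buffers once before the timed loop and reuse them in every forward/backward call. This is a minimal host-only sketch, not the code in bench_conv_device.cpp. `DeviceBuffer`, `conv_forward`, `run_benchmark`, and the allocation counter are hypothetical names, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so that the example runs without a GPU.

```cpp
#include <cstddef>
#include <cstdlib>

// Counts allocations so we can check that none happen inside the loop.
static int g_alloc_count = 0;

// Hypothetical stand-in for a cuDNN device buffer (assumption: the real
// code would call cudaMalloc/cudaFree here instead of malloc/free).
struct DeviceBuffer {
  void* ptr;
  std::size_t bytes;
  explicit DeviceBuffer(std::size_t n) : ptr(std::malloc(n)), bytes(n) {
    ++g_alloc_count;
  }
  ~DeviceBuffer() { std::free(ptr); }
  DeviceBuffer(const DeviceBuffer&) = delete;
  DeviceBuffer& operator=(const DeviceBuffer&) = delete;
};

// Hypothetical forward pass: it only uses the buffer it is given and
// never allocates, so timing it measures the computation alone.
void conv_forward(DeviceBuffer& workspace) {
  static_cast<unsigned char*>(workspace.ptr)[0] = 1;  // placeholder work
}

// Allocates once, then calls forward `iters` times.
// Returns how many allocations happened during the whole run.
int run_benchmark(int iters, std::size_t bytes) {
  g_alloc_count = 0;
  DeviceBuffer workspace(bytes);  // allocation hoisted out of the loop
  for (int i = 0; i < iters; ++i) {
    conv_forward(workspace);  // the timed region would wrap only this call
  }
  return g_alloc_count;
}
```

With this structure, `run_benchmark(10, 1024)` performs a single allocation regardless of the iteration count, so the allocation cost no longer shows up in the per-call timings.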

Important

Make sure you build with -DCMAKE_BUILD_TYPE=Release for your final experiments.
Run all experiments on sievert.cs.iupui.edu (we can change this, but let's keep it as our current plan).
You can post your results (and any updates) here so that we can compare.

Sample input/output

(py36) lifen@sievert(:):~/Workspace/gpu-computing-materials/build_cuda$./tests/bench-conv-device  &> result.txt
(py36) lifen@sievert(:):~/Workspace/gpu-computing-materials/build_cuda$less result.txt |grep stat-cudnn
stat-cudnn      1       4       32      1       3       17.425  3.513
stat-cudnn      1       4       32      4       3       5.626   5.027
stat-cudnn      1       4       32      16      3       13.015  9.791
stat-cudnn      4       4       32      1       3       5.683   5.260
stat-cudnn      4       4       32      4       3       13.332  10.296
stat-cudnn      4       4       32      16      3       42.737  29.357
stat-cudnn      16      4       32      1       3       13.562  11.304
stat-cudnn      16      4       32      4       3       42.980  30.230
stat-cudnn      16      4       32      16      3       160.499 106.413
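
To compare posted results, the `stat-cudnn` lines can be pulled out of result.txt programmatically. The following is a minimal sketch, assuming each line is the tag followed by whitespace-separated numeric fields, as in the sample above. The meaning of each column is not stated on this page, so the fields are returned as an unnamed list. `parse_stat_line` is a hypothetical helper, not part of the repository.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the numeric fields of a line whose first
// token equals `tag`, or an empty vector if the line does not match.
std::vector<double> parse_stat_line(const std::string& line,
                                    const std::string& tag = "stat-cudnn") {
  std::istringstream in(line);
  std::string first;
  std::vector<double> fields;
  if (!(in >> first) || first != tag) return fields;  // not a stat line
  double value;
  while (in >> value) fields.push_back(value);  // collect numeric columns
  return fields;
}
```

Calling `parse_stat_line` on each line of result.txt and keeping the non-empty results gives one row of numbers per benchmark configuration, which can then be compared across runs.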
