baidu-research / DeepBench

Benchmarking Deep Learning operations on different hardware
Apache License 2.0

Add cuDNNv6 results for 1080Ti and TitanXp. #37

Closed bryancatanzaro closed 7 years ago

bryancatanzaro commented 7 years ago

Added recent results for GTX1080Ti and TitanXp.

sharannarang commented 7 years ago

@bryancatanzaro, thanks for contributing these results!

Some small comments regarding the format:

bryancatanzaro commented 7 years ago

Thanks @sharannarang! I'll send in an updated pull request with two separate Excel sheets containing the revisions you requested. The all_reduce results should be the same as the TitanX Maxwell results, so I think we'll leave them out if that's OK.

bryancatanzaro commented 7 years ago

@sharannarang I've made the changes you requested. Just to make sure we agree here: by moving from 25 to 50 timesteps for the Vanilla RNN results, the reported TFlops go up by 2X. They still look reasonable, so this is probably correct if you've found an error in the script, but I just wanted to make sure that's what you were expecting.

sharannarang commented 7 years ago

Thanks @bryancatanzaro. The changes look good.

Regarding the RNN timesteps, the benchmark always prints 25 timesteps here:

https://github.com/baidu-research/DeepBench/blob/master/code/nvidia/rnn_bench.cu#L303

The RNN kernels are specified with 50 timesteps in the problem set, so it's just a printing error. As you mentioned, the reported TFlops look reasonable.

I'll merge this PR.
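
For readers following the timestep discussion above, here is a minimal sketch of why switching from 25 to 50 timesteps doubles the reported TFlops. It assumes that TFlops are derived from a measured time and a FLOP count proportional to the number of timesteps. The function names, the FLOP formula (one recurrent matrix multiply per timestep), and the example sizes and timing are illustrative assumptions, not taken from the DeepBench source or the submitted spreadsheets.

```python
def vanilla_rnn_flops(hidden_size, batch_size, timesteps):
    """Approximate FLOPs for a vanilla RNN forward pass (hypothetical helper).

    Assumes one (hidden x hidden) @ (hidden x batch) matrix multiply per
    timestep, counted as 2 FLOPs per multiply-add.
    """
    return 2 * hidden_size * hidden_size * batch_size * timesteps


def tflops(flops, seconds):
    """Convert a FLOP count and an elapsed time in seconds to TFlops."""
    return flops / seconds / 1e12


# Hypothetical problem size and measured time, for illustration only.
hidden, batch = 1760, 16
elapsed_seconds = 1.0e-3

# The kernel actually ran 50 timesteps, but the printed value was 25.
from_printed = tflops(vanilla_rnn_flops(hidden, batch, 25), elapsed_seconds)
from_actual = tflops(vanilla_rnn_flops(hidden, batch, 50), elapsed_seconds)

ratio = from_actual / from_printed
print(f"TFlops ratio (50 vs 25 timesteps): {ratio:.1f}")
```

Because the measured time is unchanged and only the assumed timestep count differs, the computed throughput scales linearly with that count under this model.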