manjunaths closed this issue 8 years ago
the performance numbers are below my expectations
On an NVIDIA device, generally speaking, one can expect to be about 2-4 times slower than the CUDA version. If you're slower than that, some optimization might be possible. If you're already within that range, getting any faster is pretty hard.
Are there any tunables(/magic numbers/variables)
There isn't a slider from 'slow' to 'fast' that you can move, as such :-) Any optimizations will be on a case-by-case, layer-by-layer basis, and will generally involve code changes.
By the way, you can see which layers are using most of your time by calling cltorch.setTiming(1) and cltorch.dumpTimings().
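For example, a rough sketch of how those two calls might be wrapped around a forward pass. The toy network, the layer sizes and the batch size are made up for illustration, and it assumes torch, nn, cltorch and clnn are all installed:

```lua
-- Sketch only: the network and tensor sizes below are placeholders,
-- not anything taken from the original poster's model.
require 'nn'
require 'cltorch'
require 'clnn'

-- Hypothetical toy network, moved to the OpenCL device with :cl()
local net = nn.Sequential()
net:add(nn.Linear(1024, 512))
net:add(nn.Tanh())
net:add(nn.Linear(512, 10))
net = net:cl()

-- Random input batch on the OpenCL device
local input = torch.Tensor(128, 1024):uniform():cl()

-- Warm-up pass, so one-off kernel compilation is kept out of the timings
net:forward(input)

cltorch.setTiming(1)     -- turn timing collection on
for i = 1, 10 do
  net:forward(input)
end
cltorch.dumpTimings()    -- print the timings collected so far
```

Swapping in your own model and a realistic input batch should show where most of the time goes, which is the place to start looking for a layer-specific fix.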
Greetings,
I am trying to run cltorch on some OpenCL-enabled devices, but the performance numbers are below my expectations. Are there any tunables (magic numbers or variables) in the code of either clnn or cltorch that I can change to see if that makes a difference in performance?
Thanks.