hughperkins / cltorch

An OpenCL backend for torch.

Are there any tunables that affect performance on different GPUs ? #71

Closed manjunaths closed 8 years ago

manjunaths commented 8 years ago

Greetings,

I am trying to run cltorch on some OpenCL-enabled devices, but the performance numbers are below my expectations. Are there any tunables(/magic numbers/variables) in the code of either clnn or cltorch that I can change to see if that makes a difference in performance?

Thanks.

hughperkins commented 8 years ago

> the performance numbers are below my expectations

On an NVIDIA device, generally speaking, one can get to within about 2-4 times slower than the CUDA version. If you're slower than that, some optimization might be possible. If you're already within that range, getting any faster is pretty hard.

> Are there any tunables(/magic numbers/variables)

There isn't a slider from 'slow' to 'fast' that you can move, as such :-) Any optimizations will be on a case-by-case, layer-by-layer basis, and will generally involve code changes.

By the way, you can see which layers are taking most of your time by using cltorch.setTiming(1) and cltorch.dumpTimings().
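A minimal sketch of how those two calls might be wrapped around a forward pass. The model, layer sizes, batch size, and iteration count are made up for illustration. The timing function names are copied from the comment above and have not been checked against a particular cltorch version, and the cltorch.synchronize() call is an assumption that may need adjusting.

```lua
require 'cltorch'
require 'clnn'

-- Hypothetical toy model; swap in the network you actually want to profile.
local model = nn.Sequential()
model:add(nn.Linear(1024, 1024))
model:add(nn.Tanh())
model:add(nn.Linear(1024, 10))
model = model:cl()                          -- move parameters to the OpenCL device

-- Hypothetical input batch: 128 samples of 1024 features each.
local input = torch.Tensor(128, 1024):uniform():cl()

-- Turn on timing collection (function name as given in the comment above).
cltorch.setTiming(1)

local output
for i = 1, 10 do
  output = model:forward(input)
end

-- Assumption: wait for queued kernels to finish before reading the timings.
cltorch.synchronize()

-- Print the collected timings.
cltorch.dumpTimings()
```

If one entry dominates the dump, that layer is the place to look for a case-by-case optimization.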