Closed kikaxa closed 7 years ago
it will end up as best anyway, but after this patch, we won't spend time, power and memory on trying other, potentially very slow, kernels.
Fair point.
Can you submit as a branch and/or pull request?
sorry, i don't use git anywhere at the time, but can do this for you, if you would like to.
Hi @hughperkins,
When I use deepcl on intel gpus, i often get "0ms" timings when auto-optimizing layers.
This patch directs deepcl to use the first instance with "0ms" timing it encounters as best.
For me, this patch removes 1.7s of "startup" time out of 5s run-time.
example log:
patches: