yogevb / a-dda

Automatically exported from code.google.com/p/a-dda

Timing of OpenCL version is sometimes inadequate #130

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When running adda_ocl (without precise timing), clock() is used to measure all 
timings except the first value, "Total wall time". clock() is supposed to 
measure processor time, and this may or may not include GPU time.

In particular, attached are two log files from running
adda -grid 128 -m 1.05 0
on Fedora 14 with Nvidia GTX570 using two driver versions (260.x from XOrg 
driver package, and 270.x directly from Nvidia).

When using the 260.x driver, the timing is inadequate, as indicated by the large 
difference between "Total wall time" and "Total time". The main problem is that 
the reported time per iteration is several times smaller than the real one 
(because the GPU takes most of the time). When using 270.x, everything is fine, 
since clock() counts the GPU time as well.
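
To illustrate the kind of discrepancy described above, here is a minimal POSIX sketch. It relies on an assumption that the logs do not confirm: that with the older driver the host thread simply blocks while the GPU works, so clock() does not advance. The sleep stands in for that blocking wait, and all function names are hypothetical, not taken from ADDA.

```c
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime/nanosleep under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic POSIX clock. */
static double wall_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Processor seconds consumed by this process, as reported by clock(). */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

/* Stand-in for a host thread blocked on the GPU (assumption): it sleeps
   and therefore uses almost no processor time. */
static void idle_wait_ms(long ms)
{
    struct timespec req;
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&req, NULL);
}

/* Measures the same idle wait with both timers. */
static void measure_idle_wait(long ms, double *wall, double *cpu)
{
    assert(wall != NULL && cpu != NULL);
    double w0 = wall_seconds();
    double c0 = cpu_seconds();
    idle_wait_ms(ms);
    *cpu = cpu_seconds() - c0;
    *wall = wall_seconds() - w0;
}
```

Calling measure_idle_wait(200, &wall, &cpu) should give a wall value of about 0.2 s and a cpu value close to zero, which is the same pattern as "Total wall time" versus "Total time" with the 260.x driver. If the driver busy-waits instead of blocking, both values would be expected to agree.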

So first, this issue needs to be investigated in detail. The possible 
solutions are either to replace clock() with something else (this is related 
to issue 43 and is hard to make portable) or at least to add corresponding 
warnings to the manual, etc.
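
As a rough idea of what replacing clock() could involve, below is a sketch of a wall-clock helper with separate Windows and POSIX branches. This is not ADDA code and not necessarily what issue 43 proposes. The names wall_time_seconds and wall_time_diff are hypothetical, and the choice of QueryPerformanceCounter and clock_gettime(CLOCK_MONOTONIC) is only one possible option. The need for such per-platform branches is part of why this is hard to make portable.

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime under strict C modes */
#endif
#include <assert.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical helper: wall-clock seconds from a monotonic source.
   Only differences between two readings are meaningful. */
static double wall_time_seconds(void)
{
#ifdef _WIN32
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&count);
    return (double)count.QuadPart / (double)freq.QuadPart;
#else
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
#endif
}

/* Elapsed wall time between two readings taken with the helper above. */
static double wall_time_diff(double start, double end)
{
    assert(end >= start);
    return end - start;
}
```

A timer like this counts the time spent waiting for the GPU regardless of how the driver waits, but unlike clock() it also includes time when the process is descheduled, so the two are not interchangeable.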

Original issue reported on code.google.com by yurkin on 10 Jun 2011 at 3:10

Attachments:

GoogleCodeExporter commented 9 years ago
This problem does not appear with the latest drivers. Moreover, I have added a 
brief warning to the manual. It may be interesting to implement a precise GPU 
timer, but that would be a completely different issue.

Original comment by yurkin on 16 Apr 2012 at 9:31