improve timers throughout the code

GoogleCodeExporter commented 8 years ago

Description From Bob Marinier 2008-06-13 11:16:14 (-) [reply]
Time sources are used in various places in Soar and SML, including:

1) Timers in the kernel itself, used to keep track of how long various 
parts of
the code take to run (as reported by the stats command).

2) The "time" command, which reports how long a given command takes to 
execute.

3) In soar_thread::OSSpecificTimer (added on the gski-rem branch; used to
profile connections and by TestSMLPerformance).

Each of these uses a different mechanism for tracking time, with different
resolutions and consequences.  For example, clock() has a low resolution, 
and
it measures real time (as opposed to process time).  In contrast, the 
Windows
system call QueryPerformanceCounter is high resolution and measures process
time (and thus is not affected by things like multi-processor systems and 
power
saving features that change the processor frequency).  Ideally, we want to 
use
the same timing mechanisms throughout all the code, and it should be high
resolution.  It should measure real time or processor time as appropriate
(e.g., the "time" command claims to report both).

Here's how times are currently tracked by various parts of the code:

1) Kernel times are tracked using getrusage on linux, and GetProcessTimes 
on
windows.  These both have low resolution and high call cost.

2) The "time" command reports processor time and real time.  For processor
time, it uses gettimeofday on linux and GetTickCount on windows.  For real 
time
it uses clock.  As far as I can tell, all of these report real time (so the
results of this command are incorrect).  Clock and GetTickCount are low
resolution. gettimeofday is affected by anything that changes the system 
clock
(e.g., NTP services), and is also affected by anything that changes the
processor frequency (e.g., power saving modes).

3) soar_thread::OSSpecificTimer uses gettimeofday on linux (with the 
problems
as noted above) and QueryPerformanceCounter on windows, which has a high 
call
cost, but is the highest resolution timer available on windows, and isn't
affected by power saving features or changes to the system time, and 
shouldn't
be affected by multiprocessor/core systems (barring BIOS bugs).

Choosing the "best" time source:  It seems like there is no single time 
source
that has no disadvantages.  Furthermore, there is no cross-platform time
source.  On Windows, QueryPerformanceCounter is probably the best choice: 
it
has the best resolution, and the overhead shouldn't be any worse than what 
we
already have.  On Linux, the highest resolution seems to be gettimeofday, 
which
is unfortunate given all of its drawbacks.  A possible alternative is to 
use
RDTSC (assembly instruction), but this is only available on Intel machines, 
and
suffers from various problems (see the MS and Intel references below).  
Still,
it may be the best we can do on Linux.  For process time (as opposed to 
real
time), it looks like we're stuck with GetProcessTimes and getrusage.

Regardless of the decisions we make here, we should have a single class for
getting time information that is used by all of these different places.  I
recommend using the existing soar_thread::OSSpecificTimer (which will have 
to
be moved out of ConnectionSML, but the thread stuff doesn't really belong 
there
anyway).  Then as we learn more we can update the code in one place, and
everything will be consistent.

For places where the code runs so fast that timing is useless (e.g., timing
some of Soar's phases), it may make sense to disable the timers altogether.  
If
the code appears to be running slow, people can turn the timers back on to
investigate.  As it is, timers introduce significant overhead in the kernel
(~25% in the towers-of-hanoi demo, as measured with timers --off.  
Admittedly,
more complex code will be less affected).

References:

Increasing Timing Resolution for Processes and Threads in Linux
http://ww2.cs.fsu.edu/~hines/present/timing_linux.pdf

Win32 Performance Measurement Options
http://www.ddj.com/windows/184416651
In particular: http://www.ddj.com/showArticle.jhtml?
documentID=win0305a&pgno=18

Thread on gettimeofday resolution
http://kerneltrap.org/mailarchive/linux-kernel/2008/4/10/1392304

Game Timing and Multicore Processors
http://msdn.microsoft.com/en-us/library/bb173458(VS.85).aspx

Measure Code Sections Using The Enhanced Timer
http://softwarecommunity.intel.com/articles/eng/2589.htm
------- Comment #1 From Bob Marinier 2008-06-15 14:40:43 (-) [reply] ------
-
Another library to consider is boost::date_time.  This uses gettimeofday on
linux and ftime on windows.  Both report real time and not process time.

I explored the effects of making the timers in the kernel itself header-
only
(to promote inlining).  This made no difference in the times.  However, 
running
TestSoarPerformance with "timers --off" resulted in a 20% savings, which is
consistent with the earlier test.  Running TestSMLPerformance with "timers
--off" resulted in a 6-8% savings, which shows that they have an impact 
even
when the kernel isn't doing anything.

Given that timers appear to be inherently expensive to use, especially 
process
timers, our best option may be to extend the timers command to give the 
user
more control over which timers are enabled.

Currently, they can all be turned on or off.  In my experience, people 
often
use the kernel and total times reported by stats, but only look at the 
phase
times if there is a problem.  In the absence of a problem, the phase times 
are
often lower than the resolution of the clock anyway, and thus don't report
meaningful data.

An improvement would be to allow kernel and total timers to be on, but all
other timers to be turned off.  (Arguably, the flexibility to manipulate
individual timers would be useful, but it seems like this would rarely be
used).  Furthermore, the stats command would report that a timer is off 
rather
than report zero times in the table.  Finally, the phase timers would be 
off by
default (or at least many people would find value in turning them off).

We should still have a separate "header only" timers library that cam be 
used
to get process or real time.  This will increase consistency and fix 
existing
bugs as noted above.
------- Comment #2 From Bob Marinier 2008-06-15 17:58:13 (-) [reply] ------
-
It looks like there is already a freely available timers library:

http://synesis.com.au/software/stlsoft/doc-
1.9/group__group____library____performance.html

The license appears to be BSD compatible.  Strangely, the windows and unix
implementations are in different namespaces, but we can work around that
easily.  We may want to provide a thin wrapper around this, e.g. to get 
times
as doubles (the library only returns integer types for seconds, 
milliseconds,
etc., separately).

Original issue reported on code.google.com by voigtjr@gmail.com on 23 Jul 2009 at 4:54

GoogleCodeExporter commented 8 years ago

Original comment by voigtjr@gmail.com on 23 Jul 2009 at 5:29

Added labels: Imported-Bugzilla

GoogleCodeExporter commented 8 years ago

[deleted comment]

GoogleCodeExporter commented 8 years ago

Updated link in Bob's #2 comment:
[http://www.stlsoft.org/doc-1.9/group__group____library____performance.html 
link]

Original comment by voigtjr@gmail.com on 5 Oct 2009 at 4:35

GoogleCodeExporter commented 8 years ago

I ended up trying out the stlsoft library Bob recommended and it works great. I 
abstracted it in to a soar_timer interface and soar_timer_accumulator class 
that uses 
the timers (which vary according to platform). Things are on their own branch 
right now 
but will be folded in to the smem branch after a bit more testing. All three 
cases 
described above (kernel, time command, and connectionsml) all use the new 
timers now 
(they are included in a header only misc.h file in shared). I may move this to 
something like soar_timer.h.

Original comment by voigtjr@gmail.com on 8 Oct 2009 at 5:40

Changed state: Fixed

sleyzerzon / soar

improve timers throughout the code #31