SoarGroup / Soar

Soar, a general cognitive architecture for systems that exhibit intelligent behavior.
http://soar.eecs.umich.edu

improve timers throughout the code #32

Closed ghost closed 10 years ago

ghost commented 10 years ago

From voigtjr@gmail.com on July 23, 2009 00:54:42

Description from Bob Marinier, 2008-06-13 11:16:14:

Time sources are used in various places in Soar and SML, including:

1) Timers in the kernel itself, used to keep track of how long various parts of the code take to run (as reported by the stats command).

2) The "time" command, which reports how long a given command takes to execute.

3) In soar_thread::OSSpecificTimer (added on the gski-rem branch; used to profile connections and by TestSMLPerformance).

Each of these uses a different mechanism for tracking time, with different resolutions and consequences. For example, clock() has a low resolution, and it measures real time (as opposed to process time). In contrast, the Windows system call QueryPerformanceCounter is high resolution and measures process time (and thus is not affected by things like multi-processor systems and power saving features that change the processor frequency). Ideally, we want to use the same timing mechanism throughout all the code, and it should be high resolution. It should measure real time or processor time as appropriate (e.g., the "time" command claims to report both).
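As a point of reference for the Windows discussion above, here is a minimal sketch (plain Win32 calls, nothing Soar-specific) of turning QueryPerformanceCounter ticks into seconds using QueryPerformanceFrequency:

```cpp
#include <windows.h>
#include <cstdio>

int main()
{
    LARGE_INTEGER freq, start, stop;
    QueryPerformanceFrequency(&freq);   // counter ticks per second
    QueryPerformanceCounter(&start);

    // ... code being timed ...

    QueryPerformanceCounter(&stop);
    double seconds = double(stop.QuadPart - start.QuadPart) / double(freq.QuadPart);
    printf("elapsed: %f s\n", seconds);
    return 0;
}
```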

Here's how times are currently tracked by various parts of the code:

1) Kernel times are tracked using getrusage on Linux and GetProcessTimes on Windows. Both have low resolution and a high call cost.

2) The "time" command reports processor time and real time. For processor time, it uses gettimeofday on linux and GetTickCount on windows. For real time it uses clock. As far as I can tell, all of these report real time (so the results of this command are incorrect). Clock and GetTickCount are low resolution. gettimeofday is affected by anything that changes the system clock (e.g., NTP services), and is also affected by anything that changes the processor frequency (e.g., power saving modes).

3) soar_thread::OSSpecificTimer uses gettimeofday on Linux (with the problems noted above) and QueryPerformanceCounter on Windows. QueryPerformanceCounter has a high call cost, but it is the highest-resolution timer available on Windows, isn't affected by power saving features or changes to the system time, and shouldn't be affected by multiprocessor/multicore systems (barring BIOS bugs).

Choosing the "best" time source: It seems like there is no single time source that has no disadvantages. Furthermore, there is no cross-platform time source. On Windows, QueryPerformanceCounter is probably the best choice: it has the best resolution, and the overhead shouldn't be any worse than what we already have. On Linux, the highest resolution seems to be gettimeofday, which is unfortunate given all of its drawbacks. A possible alternative is to use RDTSC (assembly instruction), but this is only available on Intel machines, and suffers from various problems (see the MS and Intel references below).
Still, it may be the best we can do on Linux. For process time (as opposed to real time), it looks like we're stuck with GetProcessTimes and getrusage.
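For reference, a minimal sketch of reading process (CPU) time with the two calls we appear to be stuck with; the unit conversions (microseconds for getrusage, 100-nanosecond intervals for the GetProcessTimes FILETIMEs) are the standard ones for these APIs:

```cpp
// Returns user + system CPU time of the current process, in seconds.
#ifdef _WIN32
#include <windows.h>
double process_seconds()
{
    FILETIME creation, exit_t, kernel, user;
    GetProcessTimes(GetCurrentProcess(), &creation, &exit_t, &kernel, &user);
    ULARGE_INTEGER k, u;
    k.LowPart = kernel.dwLowDateTime;  k.HighPart = kernel.dwHighDateTime;
    u.LowPart = user.dwLowDateTime;    u.HighPart = user.dwHighDateTime;
    return (k.QuadPart + u.QuadPart) * 1e-7;   // FILETIME is in 100 ns units
}
#else
#include <sys/resource.h>
double process_seconds()
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return (ru.ru_utime.tv_sec + ru.ru_stime.tv_sec)
         + (ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) * 1e-6;
}
#endif
```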

Regardless of the decisions we make here, we should have a single class for getting time information that is used by all of these different places. I recommend using the existing soar_thread::OSSpecificTimer (which will have to be moved out of ConnectionSML, but the thread stuff doesn't really belong there anyway). Then as we learn more we can update the code in one place, and everything will be consistent.
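To make the suggestion concrete, here is a hedged sketch of what such a single timer class could look like; the name and methods are illustrative only (not the actual OSSpecificTimer API), but the idea is one wrapper that picks the platform's time source and always reports seconds as a double:

```cpp
// Illustrative sketch: a single wall-clock timer class used everywhere, so the
// underlying time source can be swapped in one place.
#ifdef _WIN32
#include <windows.h>
class WallClockTimer
{
    LARGE_INTEGER freq_, start_;
public:
    WallClockTimer()   { QueryPerformanceFrequency(&freq_); start_.QuadPart = 0; }
    void start()       { QueryPerformanceCounter(&start_); }
    double elapsed_sec() const
    {
        LARGE_INTEGER now;
        QueryPerformanceCounter(&now);
        return double(now.QuadPart - start_.QuadPart) / double(freq_.QuadPart);
    }
};
#else
#include <sys/time.h>
class WallClockTimer
{
    struct timeval start_;
public:
    WallClockTimer()   { start_.tv_sec = 0; start_.tv_usec = 0; }
    void start()       { gettimeofday(&start_, 0); }
    double elapsed_sec() const
    {
        struct timeval now;
        gettimeofday(&now, 0);
        return (now.tv_sec - start_.tv_sec) + (now.tv_usec - start_.tv_usec) * 1e-6;
    }
};
#endif
```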

For places where the code runs so fast that timing is useless (e.g., timing some of Soar's phases), it may make sense to disable the timers altogether. If the code appears to be running slowly, people can turn the timers back on to investigate. As it is, timers introduce significant overhead in the kernel (~25% in the towers-of-hanoi demo, as measured with timers --off; admittedly, more complex code will be less affected).

References:

Increasing Timing Resolution for Processes and Threads in Linux: http://ww2.cs.fsu.edu/~hines/present/timing_linux.pdf

Win32 Performance Measurement Options: http://www.ddj.com/windows/184416651 (in particular: http://www.ddj.com/showArticle.jhtml?documentID=win0305a&pgno=18)

Thread on gettimeofday resolution: http://kerneltrap.org/mailarchive/linux-kernel/2008/4/10/1392304

Game Timing and Multicore Processors: http://msdn.microsoft.com/en-us/library/bb173458(VS.85).aspx

Measure Code Sections Using The Enhanced Timer: http://softwarecommunity.intel.com/articles/eng/2589.htm

Comment #1 from Bob Marinier, 2008-06-15 14:40:43:

Another library to consider is boost::date_time. This uses gettimeofday on Linux and ftime on Windows. Both report real time and not process time.

I explored the effects of making the timers in the kernel itself header-only (to promote inlining). This made no difference in the times. However, running TestSoarPerformance with "timers --off" resulted in a 20% savings, which is consistent with the earlier test. Running TestSMLPerformance with "timers --off" resulted in a 6-8% savings, which shows that the timers have an impact even when the kernel isn't doing anything.

Given that timers appear to be inherently expensive to use, especially process timers, our best option may be to extend the timers command to give the user more control over which timers are enabled.

Currently, they can all be turned on or off. In my experience, people often use the kernel and total times reported by stats, but only look at the phase times if there is a problem. In the absence of a problem, the phase times are often lower than the resolution of the clock anyway, and thus don't report meaningful data.

An improvement would be to allow the kernel and total timers to stay on while all other timers are turned off. (Arguably, the flexibility to manipulate individual timers would be useful, but it seems like this would rarely be used.) Furthermore, the stats command would report that a timer is off rather than report zero times in the table. Finally, the phase timers would be off by default (or at least many people would find value in turning them off).
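One purely illustrative way to implement that finer-grained control inside the kernel would be a small bitmask of timer categories that each timing site checks before starting its timer (none of these names exist in the current code):

```cpp
// Illustrative sketch only; these names are not part of the Soar codebase.
enum TimerCategory
{
    TIMER_KERNEL = 1 << 0,   // kernel time reported by stats
    TIMER_TOTAL  = 1 << 1,   // total time including SML/callback overhead
    TIMER_PHASE  = 1 << 2    // per-phase timers, off by default
};

struct TimerSettings
{
    unsigned enabled_mask;
    bool enabled(TimerCategory c) const { return (enabled_mask & c) != 0; }
};

// At a timing point in the kernel:
//   if (settings.enabled(TIMER_PHASE)) phase_timer.start();
//   ... run the phase ...
//   if (settings.enabled(TIMER_PHASE)) phase_timer.stop();
```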

We should still have a separate header-only timers library that can be used to get process or real time. This will increase consistency and fix the existing bugs noted above.

Original issue: http://code.google.com/p/soar/issues/detail?id=31

ghost commented 10 years ago

From Bob Marinier on June 15, 2008 17:58:13

It looks like there is already a freely available timers library: http://synesis.com.au/software/stlsoft/doc-1.9/group__group____library____performance.html

The license appears to be BSD-compatible. Strangely, the Windows and Unix implementations are in different namespaces, but we can work around that easily. We may want to provide a thin wrapper around this, e.g. to get times as doubles (the library only returns integer types for seconds, milliseconds, etc., separately).
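A sketch of such a thin wrapper, assuming the STLSoft 1.9 performance_counter interface (winstl::performance_counter / unixstl::performance_counter with start(), stop(), and get_microseconds()); the header paths and method names should be verified against the installed version:

```cpp
#ifdef _WIN32
#include <winstl/performance/performance_counter.hpp>
typedef winstl::performance_counter platform_counter;
#else
#include <unixstl/performance/performance_counter.hpp>
typedef unixstl::performance_counter platform_counter;
#endif

// Wrapper that hides the namespace difference and reports seconds as a double.
class seconds_timer
{
    platform_counter counter_;
public:
    void start() { counter_.start(); }
    void stop()  { counter_.stop(); }
    double elapsed_sec()
    {
        // STLSoft returns integral seconds/milliseconds/microseconds separately;
        // convert the microsecond count to a fractional number of seconds.
        return static_cast<double>(counter_.get_microseconds()) / 1e6;
    }
};
```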

ghost commented 10 years ago

From voigtjr@gmail.com on July 22, 2009 22:29:55

Labels: Imported-Bugzilla

ghost commented 10 years ago

From voigtjr@gmail.com on October 05, 2009 09:35:05

Updated link in Bob's #2 comment: http://www.stlsoft.org/doc-1.9/group__group____library____performance.html

ghost commented 10 years ago

From voigtjr@gmail.com on October 08, 2009 10:40:47

I ended up trying out the stlsoft library Bob recommended, and it works great. I abstracted it into a soar_timer interface and a soar_timer_accumulator class that uses the platform-specific timers. Things are on their own branch right now but will be folded into the smem branch after a bit more testing. All three cases described above (kernel, time command, and ConnectionSML) use the new timers now (they are included in a header-only misc.h file in shared). I may move this to something like soar_timer.h.
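The comment doesn't spell out the new interface, but usage presumably looks roughly like the following sketch; the method names here are guesses for illustration, not the actual API in misc.h:

```cpp
// Hypothetical usage; the real soar_timer / soar_timer_accumulator API may differ.
soar_timer timer;                    // wraps the platform-specific counter
soar_timer_accumulator phase_total;  // accumulates elapsed time across calls

timer.start();
run_phase();                         // placeholder for the code being timed
timer.stop();
phase_total.update(timer);           // add this interval to the running total

double seconds = phase_total.get_sec();   // e.g., what stats would report
```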

Status: Fixed